Tristan

Posted on Nov 27, 2021

Java Data Structures and Algorithms. Stacks

#java #computerscience #tristan

Introduction

This series is going to be dedicated to learning and understanding data structures and algorithms in Java. All the information I am using is coming from the book, Data Structures and Algorithms in Java by Michael T. Goodrich and Roberto Tamassia and the California State Polytechnic University course, which is free and can be found HERE, shout out professor Tang. Please do not buy the book it is expensive and can be found online for MUCH cheaper.

YouTube version:

YouTube version of this code can be found HERE

GitHub code

GitHub for the code can be found HERE

What is a abstract data type?

Before we can talk about stacks we need to understand what an abstract data type is. Also, I should point out that a stack is considered an Abstract Data Type(ADT), what is an abstract data type? Good question, an abstract data type is a purely theoretical entity that we as programmers use to simplify our understanding of algorithms and data structures. A ADT allows us to understand what something does without actually understanding how it does it. For example, when we say a stack is an ADT it lets us understand what methods it has and what principles it follows without having to go into implementation details. Later in this post we will implement an array based stack and in the next blog post we will implement a linked list based stack.

What is a Stack?

A stack is a collection of objects that are inserted and removed according to the last in first out(LIFO) principle. A user may insert objects into a stack at any time, but may only access or remove the most recently inserted object. The most recently inserted object is said to be at the top of the stack. A common mental model to have is a stack of books, where the books can only be removed or added from the top and to get to the bottom book, one has to remove all other books. The fundamental operations that the stack is most known for are push and pop. To add something to the stack we push it and to remove something from the stack we pop it. Stacks are some of the simplest of all the data structures but they are among the most important, as they are used in many different applications and as a tool for many different data structures and algorithms.

Methods of the stack

Formally a stack supports 2 update methods:

1) push(e): Adds element e to the top of the stack
2)pop(): Removes and returns the top element from the stack

A stack also supports 3 accessor methods:

3)top(): returns the top element of the stack, without removing it

4)size(): returns the number of elements in the stack

5)isEmpty(): Returns a boolean indicating weather the stack is empty

Stack Interface

To be able to formalize our abstraction of a stack, we define what is known as an application programming interface (API) in the form of a Java interface, which will describe the names of the methods that the ADT supports. With our interface we will also rely on Java's generics framework which will allow the elements stored on the stack to belong to any type we define. Just a quick reminder that interfaces can not be directly instantiated and that we have to implement them.

public interface Stack <E>{

    int size();

    boolean isEmpty();

    void push(E e);

    E top();

    E pop();
}

With the interface defined we can now finally create a proper implementation of a stack

Array Based Stack implementation

In this stack implementation, we store elements in an array, named data. We will orientate the stack so that the bottom element of the stack is always stored in cell data[0], and the top element of the stack in cell data[t]. For indexing, t will initially be equal to -1.
In order to create our stack based implementation we will follow 4 steps:

1) Create the class and instance variables

2) Create the constructors

3) Implement the interface

4) Implement the methods

1) Create the class and instance variables

The first thing we have to do is to create the class and the instance variables for the class.

public class ArrayStack<E> {
    public static final int CAPACITY = 1000; 
    private E[] data;
    private int t =-1;  
}

From the code above, first notice that we are using generics with E. The variable data is what is going to hold the actual array instance and t is going to be used for indexing. -1 will represent an empty array. Finally we deal with CAPACITY, which technically is not an instance variable due to the static modifier. This modifier means that CAPACITY is a class variable and it is shared across all the instances of the ArrayStack class. The final modifier indicates that value of this field can not be changed. The size of CAPACITY is arbitrary and 1000 is chosen simply because the book tells me to choose it.

2) Create the constructors

public class ArrayStack<E> {
   // rest of the code is above

    public ArrayStack(){
        this(CAPACITY);
    }
    public ArrayStack(int capacity){
        data = (E[])  new Object[capacity];
    }


}

We create two constructors which gives the user of our ArrayStack the flexibility to determine their own capacity of the array. However, if they do not choose a size for the array, then we create the array with the default CAPACITY. In the default constructor there is this(CAPACITY); which is just a way to call a constructor from within a constructor. The other constructor that takes in a integer parameter and does something a little hacky, which is (E[]) new Object[capacity];. First the new operator allocates an array of type object with enough memory for the specified capacity of type object (not hacky). Then casts back to an array of type E (E[])(hacky). If you try to switch Object out with E you will get an error that states, Type parameter 'E' cannot be instantiated directly. Which is actually a nice error message but why does it happen? Well there are actually two versions of the answer, we can call them the simple answer and the less simple answer.

The simple answer : Simply put, one of the restrictions on Generics, which can be found HERE, is that we cannot create instances of type parameters (E is a type parameter).

The less simple answer : Generics were retrofitted into the Java language in release 5. In order to maintain backwards compatibility with existing code bases, type erasures were created. What is type erasures? At compile time the Java compiler applies type erasure to replace all type parameters in generic types with their bounds or Object if the type parameters are unbounded( our E is unbound). This means that the produced bytecode contains only ordinary classes, interfaces and methods. So if we were to put new E[capacity] due to type erasure E would be replaced with Object, which would negate all of the benefits of generics and at runtime our code would be unable to tell what type we are using.

The topic of type erasure is still not 100% clear to me, HERE, is the stack overflow question I consulted for this answer. If anyone has a better understanding of type erasures please comment down below.

3) Implement the interface

In this step we will implement the interface that we defined earlier:

public class ArrayStack<E> implements Stack<E> {

}

Notice that we also use Generics on the interface. Once we have implemented the interface we will be asked to implement the methods.

4) Implement the methods

In this step we can implement the all of the methods that we have formalized through the interface:

public class ArrayStack<E> implements Stack<E> {

@Override
    public int size() {
        return (t+1);
    }

    @Override
    public boolean isEmpty() {
        return t==-1;
    }

    @Override
    public void push(E e) throws IllegalStateException{
        if(size() == data.length) throw  new IllegalStateException("Stack is full");
        data[++t] = e;

    }

    @Override
    public E top() {
        if(isEmpty()) return null;
        return data[t];
    }

    @Override
    public E pop() {
        if(isEmpty()) return null;
        E answer = data[t];
        data[t] = null;
        t--;
        return answer;
    }

}

With that, we now have a full working implementation of a Array based stack implementation. You can end this tutorial now. However, if you want to read a little more about the time and space complexities scroll down to the bottom.

Full Array based implementation

public class ArrayStack<E> implements Stack<E> {
    public static final int CAPACITY = 1000; //default array capacity
    private E[] data;                        //generic array used for storage
    private int t =-1;                       //index of the top element in the stack, -1 indicates empty

    public ArrayStack(){
        this(CAPACITY);
    }
    public ArrayStack(int capacity){
        data = (E[])  new Object[capacity];
    }

    @Override
    public int size() {
        return (t+1);
    }

    @Override
    public boolean isEmpty() {
        return t==-1;
    }

    @Override
    public void push(E e) throws IllegalStateException{
        if(size() == data.length) throw  new IllegalStateException("Stack is full");
        data[++t] = e;

    }

    @Override
    public E top() {
        if(isEmpty()) return null;
        return data[t];
    }

    @Override
    public E pop() {
        if(isEmpty()) return null;
        E answer = data[t];
        data[t] = null;
        t--;
        return answer;
    }
}

Drawbacks of the Array based implementation

Our Array implementation is simple and efficient, However, it has one major drawback. The drawback being the fixed capacity of the array. If we do not know the length of our array before hand, we could end up wasting a lot of space or get a illegal state exception when we exceed its limits. We will see in the next implementation how a linked list provides a solution to this problem.

Analyzing our implementation

Now I know we have not talked about time and space complexities or big O notation and that has been deliberate(I will make tutorials eventually). However, If you want a quick overview, checkout Professor Tang's lecture notes, HERE. shout out Professor Tang. Now lets get to analyzing.
All the methods inside of our array implementation operate in constant time O(1), because each method has a fixed number of operations. Which is good because it is a general rule of thumb that we want constant or logarithmic time for data structure operations and linear or n-log-n time for algorithms. For space complexity, our implementation is O(N) with N being the size of the array. O(N) is also called linear.

Conclusion

Thank you for taking the time out of your day to read this blog post of mine. If you have any questions or concerns please comment below or reach out to me on Twitter.

CodeNewbie Community 🌱