Collections and Encapsulation in Java Never, ever return null when you're supposed to be returning a Collection.

A core tenet of object oriented programming is encapsulation: callers should not have access to the inner workings of a class. This is something that newer languages such as Kotlin, Swift, and Ceylon have solved well with first-class properties.

Java—having been around for quite awhile—does not have the concept of first-class properties. Instead, the JavaBeans spec was introduced as Java’s method of enforcing encapsulation. Writing JavaBeans means that you need to make your class’s fields private, exposing them only via getter and setter methods.

If you’re like me, you’ve often felt when writing JavaBeans that you were writing a bunch of theoretical boilerplate that rarely served any practical purpose. Most of my JavaBeans have consisted of private fields, and their corresponding getters and setters that do nothing more than, well, get and set those private fields. More than once I’ve been tempted to simply make the fields public and dispense with the getter/setter fanfare, at least until a stern warning from the IDE sent me back, tail between my legs, to the JavaBeans standard.

Recently, though, I’ve realized that encapsulation and the JavaBean/getter/setter pattern is quite useful in a common scenario: Collection-type fields. How so? Let’s fabricate a simple class:

public class MyClass {

    private List<String> myStrings;

}

We have a field—a List of Strings—called myStrings, which is encapsulated in MyClass. Now, we need to provide accessor methods:

public class MyClass {

    private List<String> myStrings;

    public void setMyStrings(List<String> s) {
        this.myStrings = s;
    }

    public List<String> getMyStrings() {
        return this.myStrings;
    }

}

Here we have a properly-encapsulated—if not verbose—class. So we’ve done good, right? Hold that thought.

Optional lessons

Consider the Optional class, introduced in Java 8. If you’ve done much work with Optionals, you’ve probably heard the mantra that you should never return null from a method that returns an Optional. Why? Consider the following contrived example:

public class Foo {

    private String bar;

    public Optional<String> getBar() {
        return (bar == null) ? null : Optional.of(bar);
    }

}

Now clients can use the method thusly:

foo.getBar().ifPresent(log::info);

and risk throwing a NullPointerException. Alternatively, they could perform a null check:

if (foo.getBar() != null) {
    foo.getBar().ifPresent(log::info);
}

Of course, doing that defeats the very purpose of Optionals. In fact, it so defeats the purpose of Optionals that it’s become standard practice that any API that returns Optional will never return a null value.

Back to Collections. Much like an Optional contains either none or one, a Collection contains either none or some. And much like Optionals, there should be no reason to return null Collections (except maybe in rare, specialized cases, of which I can’t currently think of any). Simply return an empty (zero-sized) Collection to indicate the lack of any elements.

Sock drawer
The drawer still exists even when all the socks are removed, right?

It’s for this reason that it’s becoming more common to ensure that methods that return Collection types (including arrays) never return null values, the same as methods that return Optional types. Perhaps you or your organization have already adopted this rule in writing new code. If not, you should. After all, would you (or your clients) rather do this?:

boolean isUnique = personDao.getPersonsByName(name).size() == 1;

Or have your code littered with the likes of this? :

List<Person> persons = personDao.getPersonsByName(name);
boolean isUnique = (persons == null) ? false :persons.size() == 1;

So how does this relate to encapsulation?

Keeping Control of our Collections

Back to our MyClass class. As it is, an instance of MyClass could easily return null from the getMyStrings() method; in fact, a fresh instance would do just that. So, to adhere to our new never-return-a-null-Collection guideline, we need to fix that:

public class MyClass {

    private List<String> myStrings = new ArrayList<>();

    public void setMyStrings(List<String> s) {
        this.myStrings = s;
    }

    public List<String> getMyStrings() {
        return this.myStrings;
    }

}

Problem solved? Not exactly. Any client could call aMyClass.setMyStrings(null), in which case we’re back to square one.

At this point, encapsulation sounds like a practical—rather than solely theoretical—concept. Let’s expand the setMyStrings() method:

public void setMyStrings(List<String> s) {
    if (s == null) {
        this.myStrings.clear();
    } else {
        this.myStrings = s;
    }
}

Now, even when null is passed to the setter, myStrings will retain a valid reference (in the example here, we take null to mean that the elements should be cleared out). And of course, calling aMyClass.getMyStrings() = null will have no effect on aMyClass’ underlying myStrings variable. So are we all done?

Er, well, sort of. We could stop here. But really, there’s more we should do.

Consider that we are replacing our private ArrayList with the List passed to us by the caller. This has two problems: first, we no longer know the exact List implementation used by myStrings. In theory, this shouldn’t be a problem, right? Well, consider this:

myClass.setMyStrings(Collections.unmodifiableList("Heh, gotcha!"));

So if we ever update MyClass such that it attempts to modify the contents of myStrings, bad things can start happening at runtime.

The second problem is that the caller retains a reference to our underlying List. So now, that caller can now directly manipulate our List.

What we should be doing is storing the elements passed to us in the ArrayList to which myStrings was initialized. While we’re at it, let’s really embrace encapsulation. We should be hiding the internals of our class from outside callers. The reality is that callers of our classes shouldn’t care whether there’s an underlying List, or Set, or array, or some runtime dynamic code-generation voodoo, that’s storing the Strings that we pass to it. All they should know is that Strings are being stored somehow. So let’s update the setMyStrings() method thusly:

public void setMyStrings(Collection<String> s) {
    this.myStrings.clear(); 
    if (s != null) { 
        this.myStrings.addAll(s); 
    } 
}

This has the effect of ensuring that myStrings ends up with the same elements contained within the input parameter (or is empty if null is passed), while ensuring that the caller doesn’t have a reference to myStrings.

Now that myStrings‘ reference can’t be changed, let’s just make it a constant:

public class MyClass {
    private final List<String> myStrings = new ArrayList<>();
    ...
}

While we’re at it, we shouldn’t be returning our underlying List via our getter. That too would leave the caller with a direct reference to myStrings. To remedy this, recall the “defensive copy” mantra that Effective Java beat into our heads (or, at least, should have):

public List<String> getMyStrings() {
    // depending on what, exactly, we want to return
    return new ArrayList<>(this.myStrings);  
}

At this point, we have a well-encapsulated class that eliminates the need for null-checking whenever its getter is called. We have, however, taken some control away from our clients. Since they no longer have direct access to our underlying List, they can no longer, say, add or remove individual Strings. 

No problem. If we can simply add methods like

public void addString(String s) {
    this.myStrings.add(s);
}

and

public void removeString(String s) { 
    this.myStrings.remove(s); 
}

Might our callers need to add multiple Strings at once to a MyClass instance? That’s fine as well:

public void addStrings(Collection<String> c) {
    if (c != null) {
        this.myStrings.addAll(c);
    }
}

And so on…

public void clearStrings() {
    this.myStrings.clear();
}

public void replaceStrings(Collection<String> c) {
    clearStrings();
    addStrings(c); 
}

Collecting our thoughts

Here, then is what our class might ultimately look like:

public class MyClass {

    private final List<String> myStrings = new ArrayList<>();

    public void setMyStrings(Collection<String> s) {
        this.myStrings.clear(); 
        if (s != null) { 
            this.myStrings.addAll(s); 
        } 
    }

    public List<String> getMyStrings() {
        return new ArrayList<>(this.myStrings);
    }

    public void addString(String s) { 
        this.myStrings.add(s); 
    }

    public void removeString(String s) { 
        this.myStrings.remove(s); 
    }

    // And maybe a few more helpful methods...

}

With this, we’ve achieved a class that:

  • is still basically a POJO that conforms to the JavaBean spec
  • fully encapsulates its private member(s)

and most importantly ensures that its method that returns a Collection always does just that–returns a Collection–and never returns null.

 

 

When to use Abstract Classes Abstract classes are overused and misused. But they have a few valid uses.

Abstract classes are a core feature of many object-oriented languages, such as Java. Perhaps for that reason, they tend to be overused and misused. Indeed, discussions abound about the overuse of inheritance in OO languages, and inheritance is core to using abstract classes. 

 

In this article we’ll use some examples of patterns and anti-patterns to illustrate when to use abstract methods, and when not to. 

 

While this article presents the topic from a Java perspective, it is also relevant to most other object-oriented languages, even those without the concept of abstract classes. To that end, let’s quickly define abstract classes. If you already know what abstract classes are, feel free to skip the following section.

Defining Abstract Classes

Technically speaking, an abstract class is a class which cannot be directly instantiated. Instead, it is designed to be extended by concrete classes which can be instantiated. Abstract classes can—and typically do—define one or more abstract methods, which themselves do not contain a body. Instead, concrete subclasses are required to implement the abstract methods.

 

Let’s fabricate a quick example:
public abstract class Base {

    public void doSomething() {
        System.out.println("Doing something...")
    }

    public abstract void doSomethingElse();

}
Note that doSomething()–a non-abstract method–has implemented a body, while doSomethingElse()–an abstract method–has not. You cannot directly instantiate an instance of Base. Try this, and your compiler will complain:
Base b = new Base();
Instead, you need to subclass Base like so:
public class Sub extends Base {

    public abstract void doSomethingElse() {
        System.out.println("Doin' something else!");
    }

}
Note the required implementation of the doSomethingElse() method.
 
Not all OO languages have the concept of abstract classes. Of course, even in languages without such support, it’s possible to simply define a class whose purpose is to be subclassed, and define either empty methods, or methods that throw exceptions, as the “abstract” methods that subclasses override.

The Swiss Army Controller

Let’s examine a common abuse of abstract classes that I’ve come across frequently. I’ve been guilty of perpetuating it; you probably have, too. While this anti-pattern can appear nearly anywhere in a code base, I tend to see it quite often in Model-View-Controller (MVC) frameworks at the controller layer. For that reason, I’ve come to call it the Swiss Army Controller.

 

The anti-pattern is simple: A number of subclasses, related only by where they sit in the technology stack, extend from a common abstract base class. This abstract base class contains any number of shared “utility” methods. The subclasses call the utility methods from their own methods.

 

Swiss army controllers generally come into existence like this:

 

  1. Developers start building a web application, using an MVC framework such as Jersey.
  2. Since they are using an MVC framework, they back their first user-oriented webpage with an endpoint method inside a UserController class.
  3. The developers create a second webpage, and therefore add a new endpoint to the controller. One developer notices that both endpoints perform the same bit of logic—say, constructing a URL given a set of parameters—and so moves that logic into a separate constructUrl() method within UserController.
  4. The team begins work on product-oriented pages. The developers create a second controller, ProductController, so as to not cram all of the methods into a single class.
  5. The developers recognize that the new controller might also need to use the constructUrl() method. At the same time, they realize hey! those two classes are controllers! and therefore must be naturally related. So they create an abstract BaseController class, move the constructUrl() into it, and add extends BaseController to the class definition of UserController and ProductController.
  6. This process repeats until BaseController has ten subclasses and 75 shared methods.

Now there are a ton of useful methods for the concrete controllers to use, simply by calling them directly. So what’s the problem?

 

The first problem is one of design. All of those different controllers are in fact unrelated to each other. They may live at the same layer of our stack, and may perform a similar technical role, but as far as our application is concerned, they serve different purposes. Yet we’ve now locked them to a fairly arbitrary object hierarchy.

 

The second is more practical. You’ll realize it the first time you need to use one of the 75 shared methods from somewhere other than a controller, and you find yourself instantiating a controller class to do so.
String url = new UserController().constructUrl(key, value);
You’ll have created a trove of useful methods which now require a controller instance to access. Your first thought might be something like, hey, I can just make the method static in the controller, and use it like so:
String url = UserController.constructUrl(key, value);
That’s not much better, and actually, a little worse. Even if you’re not instantiating the controller, you’ve still tied the controller to your other classes. What if you need to use the method in your DAO layer? Your DAO layer should know nothing about your controllers. Worse, in introducing a bunch of static methods, you’ve made testing and mocking much more difficult.

 

It’s important to emphasize the interaction flow here. In this example, a call is made directly to one of the concrete subclasses’ methods. Then, at some point, this method calls in to one or more of the utility methods in the abstract base class.

 

 

In fact, in this example there was never a need for an abstract base controller class. Each shared method should have either been moved to its appropriate service-layer classes (if it takes care of business logic) or to a utility classes (if it provides general, supplementary functionality). Of course, as mentioned above, the utility classes should still be instantiable, and not simply filled with static methods.
 
Now there is a set of utility methods that is truly reusable by any class that might need them. Furthermore, we can break those methods into related groups. The above diagram depicts a class called UrlUtility which might contain only methods related to creating and parsing URLs. We might also create a class with methods related to string manipulation, another with methods related to our application’s current authenticated user, etc.

 

Note also that this approach also fits nicely with the composition over inheritance principal.

 

Inheritance and abstract classes are a powerful construct. As such, numerous examples abound of its misuse, the Swiss Army Controller being a common example. In fact, I’ve found that most typical uses of abstract classes can be considered anti-patterns, and that there are few good uses of abstract classes.

The Template Method

With that said let’s then look at one of the best uses, described by the template method design pattern. I’ve found the template method pattern to be one of the lesser known–but more useful–of the design patterns out there.

 

You can read about how the patterns works in numerous places. It was originally described in the Gang of Four Design Patterns book; many descriptions can now be found online. Let’s see how it relates to abstract classes, and how it can be applied in the real world.

 

For consistency, I’ll describe another scenario that uses MVC controllers. In our example, we have an application for which there exist a few different types of users (for now, we’ll define two: employee and admin). When creating a new user of either type, there are minor differences depending on which type of user we are creating. For example, assigning roles needs to be handled differently. Other than that, the process is the same. Furthermore, while we don’t expect an explosion of new user types, we will from time to time be asked to support a new type of user.

 

In this case, we would want to start with an abstract base class for our controllers. Since the overall process of creating a new user is the same regardless of user type, we can define that process once in our base class. Any details that differ will be relegated to abstract methods that the concrete subclasses will implement:
public abstract class BaseUserController {

    // ... variables, other methods, etc

    @POST
    @Path("/user")
    public UserDto createUser(UserInfo userInfo) {
        UserDto u = userMapper.map(userInfo);
        u.setCreatedDate(Instant.now());
        u.setValidationCode(validationUtil.generatedCode());
        setRoles(u);  // to be implemented in our subclasses
        userDao.save(u);
        mailerUtil.sendInitialEmail(u);
        return u;
    }

    protected abstract void setRoles(UserDto u);

}
Then we need simply to extend BaseUserController once for each user type:
@Path("employee")
public class EmployeeUserController extends BaseUserController {

    protected void setRoles(UserDto u) {
        u.addRole(Role.employee);
    }

}
@Path("admin")
public class AdminUserController extends BaseUserController {

    protected void setRoles(UserDto u) {
        u.addRole(Role.admin);
        if (u.hasSuperUserAccess()) {
            u.addRole(Role.superUser);
        } 
    }

}
Any time we need to support a new user type, we simply create a new subclass of BaseUserController and implement the setRoles() method appropriately.

 

Let’s contrast the interaction here with the interaction we saw with the swiss army controller.
Using the template method approach, we see that the caller (in this case, the MVC framework itself–responding to a web request–is the caller) invokes a method in the abstract base class, rather than the concrete subclass. This is made clear in the fact that we have made the setRoles() method, which is implemented in the subclasses, protected. In other words, the bulk of the work is defined once, in the abstract base class. Only for the parts of that work that need to be specialized do we create a concrete implementation.

A Rule of Thumb

I like to boil software engineering patterns to simple rules of thumb. While every rule has its exceptions, I find that it’s helpful to be able to quickly gauge whether I’m moving in the right direction with a particular design.

 

It turns out that there’s good rule of thumb when considering the use of an abstract class. Ask yourself, Will callers of your classes be invoking methods that are implemented in your abstract base class, or methods implemented in your concrete subclasses?
  • If it’s the former–you are intending to expose only methods implemented in your abstract class–odds are that you’ve created a good, maintainable set of classes.
  • If it’s the latter–callers will invoke methods implemented in your subclasses, which in turn will call methods in the abstract class–there’s a good chance that an unmaintainable anti-patten is forming.