Google Doodle Coding Bunny Level 6 Solution

Have you seen the google doodle celebrating 50 years of kids coding? You can find it here in case you’ve missed it:



This is a mini-game where you can program the movement of the bunny in order to eat all the carrots on the field. The game consists of six difficulty levels. To find the “shortest” solution to the last level can be challenging even for seasoned professionals. How can it be? Possibly because experienced developers try to find the optimal solution where the bunny collects the carrots in the least amount of steps. Whereas, the authors are looking for the shortest code – least amount of symbols to describe the algorithm.

Spoiler alert! Please find my solution below. It takes advantage of the fact that our bunny cannot fall into the abyss – as if there was an invisible fence around the field.

Weak Reference. What?

Q: What are weak references good for?
A: They have something to do with garbage collection.

While the answer above is certainly better than not knowing about weak references at all; any aspiring senior Java developer should know a bit more about them. Especially, when we consider that they are around since Java version 1.2. So, let’s dive a little deeper into this topic.

Strong references are ordinary Java references we use every day. For example:

Widget widget = new Widget();

Objects survive garbage collection and stay in the memory heap as long as they are reachable through strong references.

Let’s just stop here for a minute. Those who are only interested in weak references can skip this section or come back to it later. The rest of you should try to answer the question: Where does the garbage collector start working from? There are special objects called Garbage-Collection roots that are always reachable and serve as starting points for GC operations. There are the following four different kinds of such GC roots in Java:

  • Local variables kept alive by the stack of a thread.
  • Active Java threads are always considered live objects.
  • Static references are kept alive as long as the Class itself is not garbage collected.
  • JNI references are Java objects that the native code has created as a part of a Java Native Interface call. Such objects represent a very special form of GC root and are treated distinctively because the JVM has no idea if they are still referenced by the native code or not.

Weak references are not strong enough to keep the objects from being garbage-collected. You can create a weak reference like this:

WeakReference weakWidget = new WeakReference(widget);

In order to get back the actual widget, you call weakWidget.get(). Because the widget can get garbage-collected at any time, such a call can suddenly start returning null. There is a related class WeakHashMap where the keys (not the values!) are weak references and when the key gets garbage-collected the value is removed from the map automatically. This behavior makes it an ideal candidate for implementing cache functionality.

Would you like to know when a weakly referenced object has been garbage collected? This is what ReferenceQueue is for. If you pass it into the constructor of a weak reference it will be automatically put on the queue once the referenced object becomes a garbage.

There are different degrees of weakness. Soft references are between strong and weak references. They are less eager to throw away their object. In practice, they retain it as long as there is plenty of memory – but this is not a guarantee.

Phantom references are the weakest form of a reference: their get() method always returns null. Phantom references are only enqueued when the object has been already finalized. Which leads us to the only scenario I can think of in which phantom references can be useful: to determine when exactly a given object has been garbage-collected. You can create a phantom reference calling it’s only constructor:

PhantomReference(T referent, ReferenceQueue< ? Super T> q)

Did you notice something? Unlike soft and weak references, phantom references are not automatically cleared by the garbage collector as they are enqueued. An object that is reachable via phantom references will remain so until all such references are cleared or themselves become unreachable [5].

I Hope you found this post useful 😉

Resources:
[1] https://community.oracle.com/blogs/enicholas/2006/05/04/understanding-weak-references
[2] https://www.dynatrace.com/resources/ebooks/javabook/how-garbage-collection-works
[3] https://docs.oracle.com/javase/7/docs/api/java/lang/ref/WeakReference.html
[4] https://docs.oracle.com/javase/7/docs/api/java/util/WeakHashMap.html
[5] https://docs.oracle.com/javase/7/docs/api/java/lang/ref/PhantomReference.html

Elasticsearch: Getting a List of Distinct Values

I started using Elasticsearch a little more than a year ago. Elasticsearch is a distributed, open source search and analytics engine, designed for horizontal scalability, reliability, and easy management[1]. It has a very good, easy to use RESTful API so, you can use it with any web client. However, from Java I prefer to use the dedicated Java API. Recently I was working on indexing e-mails with a special focus on labels associated with the messages. Despite the Elastic API is pretty well documented, it took me a good amount of googling to figure out how to get back a list of unique labels. In Elasticsearch documents (in my case the e-mails) are organized into indices and to get back a list of distinct values of a property (e-mail labels) we need to use the so-called aggregation API. Here’s my solution:

import static java.util.stream.Collectors.toList;

import java.util.List;
import java.util.concurrent.ExecutionException;

import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.Aggregation;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.StringTerms;

public class AggregationRepository {
  private Client client;

  public AggregationRepository(Client client) {
    super();
    this.client = client;
  }

  public List getDistinctLabels() 
    throws InterruptedException, ExecutionException {
    SearchRequestBuilder aggregationQuery = 
      client.prepareSearch("emails")
        .setQuery(QueryBuilders.matchAllQuery())
        .addAggregation(AggregationBuilders.terms("label_agg")
          .field("labels").size(100));
    SearchResponse response = aggregationQuery.execute().get();
    Aggregation aggregation = response.getAggregations().get("label_agg");
    StringTerms st = (StringTerms) aggregation;
    return st.getBuckets().stream()
      .map(bucket -> bucket.getKeyAsString())
      .collect(toList());
  }
}


This is how the search query posted to the API looks like:

{
  "query" : {
    "match_all" : { }
  },
  "aggregations" : {
    "label_agg" : {
      "terms" : {
        "field" : "labels",
        "size" : 100
      }
    }
  }
}


And the response is…

{
  ...
  "aggregations": {
    "label_agg": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Gmail",
          "doc_count": 1
        },
        {
          "key": "Yahoo",
          "doc_count": 1
        }
      ]
    }
  }
}


Just to give a little bit of explanation: label_agg is the name of the aggregation – you can choose it however you wish. The results are coming back as buckets and the keys collected into a list gives us the list of unique labels. Of course, the aggregation API is capable of much more. This was just a simple example how to select distinct values from documents stored in Elasticsearch.

Resources

  1. Powering Data Search, Log Analysis, Analytics https://www.elastic.co/products

5 Tricky Questions from Java

While I was studying for the Java 8 OCA exam I’ve come across some questions that really confused me the first time I saw them. Here is a short list of five tricky questions you might expect on an exam or during a job interview.

The Assignment Operator

What will happen when you try to compile and run the following pieces of code independently?

Snippet #1

int i = 1;
i = i + 2.5;
System.out.println(i);

Snippet #2

int i = 1;
i += 2.5;
System.out.println(i);
  1. Both result in compile time error
  2. Both compile successfully and print 3
  3. Both compile successfully and print 3.5
  4. First snippet fails to compile; the second one compiles and prints 3

Answer

The first snippet fails to compile because expression i + 2.5 results in a double value that cannot be stored in an integer variable. You might think that the second code snippet does the same thing. However, that’s not the case. The assignment operator += not just combines the assignment and addition in order to look cooler but also casts the result of the right-hand side if required – in our case back to integer. So, option D is correct. Please note 3 is printed instead of 3.5 as variable i is of type integer.

Indexing Arrays

Consider the following class:

class Test{
   public static void main(String[ ] args){
      int[] x = { 1, 2, 3, 4};
      int[] y = { 0, 1, 3};
      System.out.println( x [ (x = y)[2] ] );
   }
}

What will it print when compiled and run?

  1. It will throw ArrayIndexOutOfBoundsException
  2. It will print 4

Answer

When indexing arrays, the expression to the left of the brackets is fully evaluated before any part of the expression within the brackets is evaluated. This means that the original value of x is fetched and remembered while the expression (x = y) [2] is evaluated. So, option B is correct. Value 4 is printed out from the first array.

Null Reference

What do you think will be the result of trying to compile and run the following code?

class A {
     public static void print(Object obj) {
          System.out.println(obj);
     }
     public static void main(String[] args) {
          A a = null;
          a.print(a);
     }
}
  1. Compile error
  2. NullPointerException is thrown at runtime
  3. Compiles and runs without an issue and prints null

Answer

Option C is correct. Please note that the print method in class A is static. It doesn’t matter that variable a has a null value the method is executed based on the type of the variable.

Date Manipulation

What will be printed out by running the following code?

LocalDate date = LocalDate.of(2016, Month.AUGUST, 20); 
Period period = Period.ofMonths(1).ofDays(1); 
LocalDate past = date.minus(period); 
System.out.println(past.format(DateTimeFormatter.ISO_DATE));
  1. 19 Aug 2016
  2. 2016-08-19
  3. 2016-07-19

Answer

The trick here is that Period.ofMonths(int) and Period.ofDays(int) are static methods, therefore they cannot be chained. So, the period subtracted from the date represents only one day without the month. The correct answer is option B. Please notice the ISO_DATE format too. Option A would be correct if we used DateTimeFormatter.ofPattern("dd MMM YYYY").

String Pool

What will be the result of attempting to compile and run the following code?

if ("true".replace('T', 't') == "true" 
     System.out.println("true"); 
else
     System.out.println("false"); 
  1. It will print “true” 
  2. It will print “false”

Answer

When comparing two strings with the == operator we ask whether they are pointing to the same object. The two "true" literals on the first line represent the same object from the string pool. When we call the replace method on the first "true" it returns a String object in which all occurrences of the first parameter are replaced with the second parameter. However if there is no change to be made the reference to the same object is returned, therefore option A is correct – the code above prints "true". If we changed the first line to if ("true".replace('t', 'T') == "True") the code would print "false" because the "True" returned by the replace method would be a new String object.

These were my favorite tricky Java questions. Of course, there are tons of questions that can be asked. The 5 questions above are really just the tip of the iceberg. Anyway, I hope this helps you regardless whether you are preparing for a Java exam, your upcoming job interview, or you’re just a Java enthusiast like I am. 

How (Not) To Use Prepared Statement in Java

Recently I was given a task to implement a new use case in a fairly old Java application. The task didn’t seem to be interesting at fist. There was an existing method to fetch a Task entity by its ID from a database and then it was enriched with a list of Owners fetched by the foreign key stored in the owner table. No Hibernate or another persistence framework was used at all. The underlying Sybase ASE database was accessed using plain old java.sql.PreparedStatement and the method in question looked something like this:

Task getTask(Long id) {
     Task task = null;
     Connection conn = openConnection();
     String sql = "SELECT * FROM TASK WHERE ID = ?";
     PreparedStatement stmt = conn.prepareStatement(sql);
     stmt.setLong(1, id);
     ResultSet rs = stmt.executeQuery();
     while(rs.next()) {
          task = createTask(rs);
          task.setOwners(getOwners(task.getId()));
     }
     return task;
}


The getTask(Long) functionality was exposed as web service. It worked reasonably well for many years until a new client application started calling this service in a loop to get back multiple tasks by a list of IDs. The task table contained approximately 1 million rows each task having 2-3 owners on average. The performance was horrible, it took around 60 seconds to get back 20+/- tasks with their owners.

I said: “Nothing is easier, I’ll do it in a batch plus I’ll pre-fetch all the owners in one call!”. There was one little hiccup namely that java.sql API does not support passing down a list of parameters to use them in an IN (…) clause. “No worries, I’ll implement a helper function to generate as many parameter placeholders (?) as needed and then another for setting a value for each of them”. So, my initial solution looked like this:

List<Task> getTasks(List<Long> ids) {
     List tasks = new ArrayList<>(ids.size());
     Connection conn = openConnection();
     String sql = "SELECT * FROM TASK WHERE ID IN ("+generatePlaceholders(ids)+")";
     PreparedStatement stmt = conn.prepareStatement(sql);
     setLongParams(stmt, ids);
     ResultSet rs = stmt.executeQuery();
     Map<Long, List<Owner>> ownersMap = getOwners(ids);
     while(rs.next()) {
          Task task = createTask(rs);
          task.setOwners(ownersMap.get(task.getId()))
          tasks.add(task);
     }
     return tasks;
}


The performance was much better. But still, it took 10-15 seconds to get back the results for the same list of IDs as before. It was very suspicious given that the same queries executed from a desktop SQL client returned under a second. Then I started thinking: “How does that prepare statement work?”. It turned out that in Sybase ASE prepared statements are compiled into a stored procedure. Which obviously takes time plus the number of input parameters (number of IDs) varies between invocations and that cripples the reusability. The solution I came up with was to fall back to use a simple java.sql.Statement with hard-coded values:

String sql = "SELECT * FROM TASK WHERE ID IN (" + join(",", ids) + ")";

With these modifications and proper indices, the queries return data in 50 – 60 milliseconds which is a 1000 fold improvement compared to the initial state – don’t take this the wrong way but I think it’s pretty cool.

One thing to note: parameterized prepared statements are often used to prevent SQL injection – or at least make really hard for hackers to pull off one. Using a simple statement is less secure but in this case the IDs are converted into Long values before they are passed down in the query. If we were to use String parameters we would have to take additional measures to make our code more secure.

To summarize my little story I think the takeaway message is that interfaces are good, but when it comes to performance optimization you have to brace yourself and look behind the curtain.