Implement a method to perform basic string compression using the counts of repeated characters

QUESTION:

String Compression: Implement a method to perform basic string compression using the counts of repeated characters. For example, the string aabcccccaaa would become a2blc5a3. If the “compressed” string would not become smaller than the original string, your method should return the original string. You can assume the string has only uppercase and lowercase letters (a – z).

SOLUTION:

  • At first glance, implementing this method seems fairly straightforward, but perhaps a bit tedious.
  • We iterate through the string, copying characters to a new string and counting the repeats.
  • At each iteration, check if the current character is the same as the next character. If not, add its compressed version to the result.

How hard could it be?

String compressBad(String str) {
String compressedString = "";
int countConsecutive = 0;
for (int i= 0; i < str.length(); i++) {
countConsecutive++;

/* If next character is different than current, append this char to result.*/
if (i + 1 >= str.length() I I str.charAt(i) != str.charAt(i + 1)) {
compressedString += "" + str. charAt ( i) + countConsecuti ve;
countConsecutive = 0;
}
}
return compressedString.length() < str.length() ? compressedString str;
}

This works. Is it efficient, though? Take a look at the runtime of this code.

  • The runtime is O (p + k2 ), where p is the size of the original string and k is thel number of character sequences. For example, if the string is aabccdeeaa, then there are six characte sequences. It’s slow because string concatenation operates in O(n2) time.

We can fix this by using a StringBuilder.

String compress(String str) {
StringBuilder compressed= new StringBuilder();
int countConsecutive = 0;
for (int i= 0; i < str.length(); i++) {
countConsecutive++;

/* If next character is different than current, append this char to result.*/
if (i + 1 >= str.length() I I str.charAt(i) != str.charAt(i + 1)) {
compressed.append(str.charAt(i));
compressed.append(countConsecutive);
countConsecutive = 0;
}
}
return compressed.length() < str.length() ? compressed.toString() : str;
} 

Both of these solutions create the compressed string first and then return the shorter of the input string and the compressed string.

  • Instead, we can check in advance. This will be more optimal in cases where we don’t have a large number of repeating characters.
  • It will avoid us having to create a string that we never use. The downside of this is that it causes a second loop through the characters and also adds nearly duplicated code.
String compress(String str) {
/* Check final length and return input string if it would be longer. */
int finallength = countCompression(str);
if (finallength >= str.length()) return str;

StringBuilder compressed= new StringBuilder(finalLength); // initial capacity
int countConsecutive = 0;
for (int i= 0; i < str.length(); i++) {
countConsecutive++;

/* If next character is different than current, append this char to result.*/
if (i + 1 >= str.length() I I str.charAt(i) != str.charAt(i + 1)) {
compressed.append(str.charAt(i));
compressed.append(countConsecutive);
countConsecutive = 0;
}
}
return compressed.toString();
}

int countCompression(String str) {
int compressedlength = 0;
int countConsecutive = 0;
for (int i= 0; i < str.length(); i++) {
countConsecutive++;

/* If next character is different than current, increase the length.*/
if (i + 1 >= str.length() I I str.charAt(i) != str.charAt(i + 1)) {
compressedlength += 1 + String.valueOf(countConsecutive).length();
countConsecutive = 0;
}
}
return compressedlength;
} 
  • One other benefit of this approach is that we can initialize StringBuilder to its necessary capacity up-front.
  • Without this, StringBuilder will (behind the scenes) need to double its capacity every time it hits capacity.
  • The capacity could be double what we ultimately need.

Leave a Comment