Building Really Long Strings

Dynamically building strings in Gupta is actually pretty fast, at least as long as the resulting string stays below 1,000,000 characters. There is also rarely a need to compose strings longer than 32 KB: SQL statements are usually limited to 32 KB, otherwise the input buffer complains. The multiline text box is limited to this length as well. Row-oriented file exports rarely contain extremely long rows of data.

Concatenation can take some time

But when serializing data as XML or JSON, you may have to build very long strings of about 1 to 10 MB if you cannot write parts to disk in between. You do this by concatenating short string parts (for example, an XML tag with its content and end tag) into one large output string. Beyond a certain length, the time required to append further strings becomes extreme. Building a string of up to 500,000 characters takes about two seconds on an average desktop computer. A million characters take up to 40 seconds. For ten million characters, I ran out of patience. That had to be optimized. Gupta itself has no problem with very large strings: you can read files tens of megabytes in size as a BLOB with SalFileRead(…). But frequently concatenating short strings can become very time-consuming.

Analysis

For a quantitative analysis I wrote a small program. In a loop, it appends short strings of length 10 to sOut. Every 100 passes, the elapsed time is measured and written to a CSV file. The result is plotted in the following diagram:

[Figure: StringAppend Time Length Diagram]

You can see a sharp bend at a string length of 500,000 characters, beyond which appending further string parts takes much longer (in debug mode the bend is at 100,000 characters).
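
The measurement loop described above can be sketched roughly as follows. This is Python, not Gupta SAL, and the names measure_appends and write_csv are made up for illustration; the original test program is not shown in this post. Python's string handling differs from Gupta's, so the sketch only illustrates the measuring method and will not reproduce the bend in the diagram.

```python
import csv
import io
import time


def measure_appends(total_appends, part="0123456789", sample_every=100):
    """Append `part` repeatedly; sample (length, elapsed seconds) every N passes."""
    out = ""
    samples = []
    start = time.perf_counter()
    for i in range(1, total_appends + 1):
        out += part  # the naive concatenation being measured
        if i % sample_every == 0:
            samples.append((len(out), time.perf_counter() - start))
    return samples


def write_csv(samples, stream):
    """Write the samples as CSV rows with a header line."""
    writer = csv.writer(stream)
    writer.writerow(["length", "seconds"])
    writer.writerows(samples)


# Small run: 1,000 appends of 10 characters, sampled every 100 passes.
samples = measure_appends(1000)
buffer = io.StringIO()  # stand-in for a real CSV file
write_csv(samples, buffer)
```

To produce a real file, pass an open file object instead of the io.StringIO buffer.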

Accelerator

If you add string parts to a temporary string and only concatenate it to the output string once it reaches a length of N characters, you can speed up the process as a whole. I determined the optimal value for N through a series of tests. It lies between 10,000 and 20,000, as can be seen in the following graphic. Creating a 10 MB string from a total of one million appends takes only 13 seconds.

[Figure: StringBuilder Performance Run]
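
To show why the temporary string helps, here is a minimal sketch of the idea in Python (again not SAL; build_buffered and its threshold parameter are hypothetical names, with threshold playing the role of N). It also counts how often the large output string is actually touched.

```python
def build_buffered(parts, threshold=10_000):
    """Collect parts in a small temporary string; append it to the
    large output string only when it has reached `threshold` characters."""
    out = ""
    temp = ""
    flushes = 0  # number of concatenations onto the large string
    for part in parts:
        temp += part  # cheap: temp always stays short
        if len(temp) >= threshold:
            out += temp  # expensive step, now done only rarely
            temp = ""
            flushes += 1
    if temp:  # append whatever is left at the end
        out += temp
        flushes += 1
    return out, flushes


# 100,000 parts of 10 characters each give a 1,000,000 character string.
parts = ["0123456789"] * 100_000
result, flushes = build_buffered(parts, threshold=10_000)
```

With parts of length 10 and a threshold of 10,000, the large string is extended only once per 1,000 appends, so 100 times here instead of 100,000 times.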

StringBuilder

I packed the behavior described above into a class StringBuilder with the following methods: Append(…) appends a string part, ToString() returns the output string, Length() returns the current length of the output string, and clear() resets it. You can download the source code of StringBuilder including the performance tests.
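
For readers without Gupta at hand, the shape of such a class can be sketched in Python. This is not the downloadable SAL source: the method names are adapted to Python conventions, and the threshold parameter and the internal fields are assumptions made for illustration.

```python
class StringBuilder:
    """Illustrative sketch of a buffered string builder (not the SAL original)."""

    def __init__(self, threshold=10_000):
        self._threshold = threshold  # assumed default, within the measured 10,000 to 20,000 range
        self._out = ""   # large output string
        self._temp = ""  # short temporary string

    def _flush(self):
        # Move the temporary string onto the output string.
        self._out += self._temp
        self._temp = ""

    def append(self, part):
        """Append a string part; returns self so calls can be chained."""
        self._temp += part
        if len(self._temp) >= self._threshold:
            self._flush()
        return self

    def to_string(self):
        """Return the complete output string."""
        self._flush()
        return self._out

    def length(self):
        """Return the current total length, including unflushed parts."""
        return len(self._out) + len(self._temp)

    def clear(self):
        """Reset the builder to an empty string."""
        self._out = ""
        self._temp = ""


sb = StringBuilder()
sb.append("<name>").append("Gupta").append("</name>")
xml = sb.to_string()
```

Returning self from append is a convenience of this sketch that allows chained calls; whether the original Append(…) does the same is not stated here.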

Download

StringBuilderPerfomanceTests.zip

Happy coding

About thomasuttendorfer
I am head of development at the software company [ frevel & fey ] in Munich. We develop business software for publishers, using Gupta Team Developer and Visual Studio.
