Monthly Archives: June 2021

Extracting files from a ProgressCRM database

We recently had a user who needed to extract files from a SQL Server database created by ProgressCRM and needed help.  He was helpful enough to send us a couple of the sample files that were not extracted correctly.

It turned out that each file that was being uploaded was stored in an OLE container.  Normally, our products (SQL Image Viewer, Access OLE Export, and SQL Blob Export) can extract the embedded files, but in this case, the entire OLE container itself had a 12-byte prefix, most likely added by ProgressCRM for their internal use.

So the task was to first extract the content without the 12-byte prefix, and then let SQL Blob Export handle the extraction of the files from the OLE container.

Luckily, this was easy enough to do in a SQLServer database.  We could use the SUBSTRING command to retrieve only the content from the 13-th byte onwards e.g.

SELECT SUBSTRING(CAST(<the column containing the binary data> AS VARBINARY(MAX)), 13, DATALENGTH(content))  FROM <the table name>

So if you find that your application is uploading your files with their own prefix data, you can use the SUBSTRING function (in SQL Server) to remove the prefix so that our products can recognize the actual file content and export them for you correctly.

Improved debugging in Access OLE Export / SQL Blob Export

Sometimes it happens that you may have some items that Access OLE Export or SQL Blob Export is unable to export correctly.  This can be due to many reasons e.g.

  • the binary data has been compressed by the uploading application
  • the binary data has been encrypted by the uploading application
  • the uploading application has appended additional header bytes to the data
  • an unsupported OLE server was used to upload the file

To troubleshoot these issues, we would ask the user to send us the raw data exactly as stored in the database.  This usually involves using SQL Image Viewer to perform the extraction.

We’ve now added the option to extract the raw data using Access OLE Export / SQL Blob Export.  You can do this by selecting the debugging – create bin files for analysis item.

Now when you have problems exporting your files, selecting this option will cause a raw copy of the data to be exported together, with the .debug.bin extension.  You can then send us these .debug.bin files for further analysis.

Using UTF8Tool

UTF8Tool is an application to encode files to the UTF-8  character set and vice-versa.  You can download the application here.  This application was originally created for a user who needed to decode/encode files stored in a text column in his database.

To use the application, select the option to either decode or encode the files.  Then enter a search pattern in the Input folder option to locate your files.  If we wanted to process only pdf files, we could enter something like this:

Next, enter the folder to store the processed files in, in the Output folder option e.g.

Once you’ve entered the values, click on the Start button to process your files.  Note that any existing files in the output folder will be overwritten.

Extracting binary data from a nvarchar(max) column

Recently we had a user that needed to extract PDF files stored in a SQL Server nvarchar(max) column.  Now, you wouldn’t normally be able to store binary files correctly in a text column, but the application they used (Deacom) first encoded the PDF file to UTF-8 text, then stored the UTF-8 text into the text column.

While SQL Image Viewer (and SQL Blob Export/Access OLE Export) could extract the UTF-8 encoded data from the table and store it in a file, the resulting PDF file cannot be opened because the data contained therein is invalid.  So we had to provide a small utility to convert the UTF-8 encoded file back to its original contents.  You can read more about the utility here.

After converting the PDF files, the user then compressed the PDF files and now wanted to upload the PDF files back into the database.  They used SQL File Import to perform the upload, but first the PDF files had be to UTF-8 encoded again, otherwise the application (Deacom) could not open the PDF files.  That can also be one using the above utility.

Finally, we had one last issue where only 8 Kb of the encoded PDF file was being uploaded.  The cause was traced to the use of the MDAC/ODBC driver to connect to SQL Server.

In our database products, when you connect to a SQL Server instance, the default is to let the application select the best available drivers.

By design, our products will use the SQL Server Native Client drivers if available, and only use MDAC/ODBC drivers when the Native Client drivers are not available.  However, MDAC/ODBC drivers limit text fields to only 8000 bytes.

When the user attempted to use the Native Client driver, the connection failed.

Installing and using the SQL Server Native Client drivers then resolved the connection and upload issue.