Working with Salesforce rich text fields

I’ve been doing quite a lot of work with the rich text fields Salesforce introduced back in API v21. I’ve had them in our app for a while and used them all over, but I’ve never had to worry about importing data into them before, because everything was created in-app. In-app data creation has been a pretty awesome experience on the whole, but when it came to importing data recently, I was stumped by the dead silence on the matter.

First came the request to import a number of legacy Word docs into Salesforce rich text format. If you’re using a modern browser on OS X, this actually works quite seamlessly via copy/paste, or so I thought after my limited trials. It turns out that the metadata and excessively long style descriptions Word uses are preserved by the Salesforce FCKeditor instance (which is what actually powers their input UI). With just a handful of images in a document of under 500 words, I hit a wall: the 32k character limit on fields. Inspecting the source for a single image made it quite clear where the issue is:

&lt;p class="MsoNormal"&gt;&lt;span style="font-family:Times;mso-bidi-font-family:Times;
mso-no-proof:yes"&gt;&lt;!--[if gte vml 1]&gt;&lt;v:shape id="Picture_x0020_29" o:spid="_x0000_i1029"
 type="#_x0000_t75" style='width:2in;height:219pt;visibility:visible;
 mso-wrap-style:square'&gt; &lt;v:imagedata src="file://localhost/Users/cpeterson/Library/Caches/TemporaryItems/msoclip/0clip_image055.jpg"
  o:title=""/&gt;&lt;/v:shape&gt;&lt;![endif]--&gt;&lt;!--[if !vml]--&gt;&lt;img width="146" height="221" src="file://localhost/Users/cpeterson/Library/Caches/TemporaryItems/msoclip/0clip_image056.png" v:shapes="Picture_x0020_29"&gt;&lt;!--[endif]--&gt;&lt;/span&gt;&lt;span style="font-family:Times;mso-bidi-font-family:Times"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;

Ouch, so copy/paste from Word is out. I had to tell a customer they get to rewrite all of their Word documents from scratch in the Salesforce rich text editor. That didn’t go over well. To rub salt in the wound, it looks like CKEditor, which superseded FCKeditor in early 2010, does actually strip out this formatting, but Salesforce has yet to upgrade to it. So close, yet so far!
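
For what it’s worth, the kind of cleanup involved can be roughed out in a few lines of Python. This is only a sketch of the general idea, not how CKEditor or Salesforce actually do it: the function names `strip_word_markup` and `fits_in_field` are made up for illustration, `FIELD_LIMIT` is my reading of the "32k" limit above, and regexes are a blunt tool for HTML, so anything beyond simple pasted fragments would want a real parser.

```python
import re

# Assumption: "32k" is taken to mean 32,000 characters.
FIELD_LIMIT = 32_000


def _clean_style(match):
    """Drop mso-* declarations from one style attribute (hypothetical helper)."""
    quote, body = match.group(1), match.group(2)
    kept = [d.strip() for d in body.split(";")
            if d.strip() and not d.strip().lower().startswith("mso-")]
    # Remove the attribute entirely if nothing but mso-* declarations was in it.
    return f" style={quote}{';'.join(kept)}{quote}" if kept else ""


def strip_word_markup(html):
    """Remove the most obvious Word-specific debris from pasted HTML."""
    # Hidden conditional blocks: <!--[if gte vml 1]> ... <![endif]-->
    html = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", html,
                  flags=re.S | re.I)
    # Revealed markers <!--[if !vml]--> and <!--[endif]-->; keep what they wrap.
    html = re.sub(r"<!--\[(?:if[^\]]*|endif)\]-->", "", html, flags=re.I)
    # Office namespace tags such as <o:p> and </o:p>.
    html = re.sub(r"</?[a-z]+:[^>]*>", "", html, flags=re.I)
    # class="MsoNormal" and friends.
    html = re.sub(r"""\s+class=(["']?)Mso\w+\1""", "", html, flags=re.I)
    # Namespaced attributes such as v:shapes="...".
    html = re.sub(r"""\s+[a-z]+:\w+=(?:"[^"]*"|'[^']*')""", "", html,
                  flags=re.I)
    # Strip mso-* declarations out of style attributes.
    html = re.sub(r"""\s+style=(["'])(.*?)\1""", _clean_style, html,
                  flags=re.S | re.I)
    # Spans left empty by the steps above.
    return re.sub(r"<span>\s*</span>", "", html, flags=re.I)


def fits_in_field(html):
    """True if the markup is within the assumed field limit."""
    return len(html) <= FIELD_LIMIT


if __name__ == "__main__":
    pasted = ('<p class="MsoNormal"><span style="mso-no-proof:yes">'
              'Hello<o:p></o:p></span></p>')
    cleaned = strip_word_markup(pasted)
    print(len(pasted), len(cleaned), fits_in_field(cleaned))
```

Note that this does nothing about the `file://` image sources, which point at a local cache and would still need the images uploaded separately.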

Today I was looking at importing some data for that same customer from their sandbox to their production org, and again I hit a wall: it’s apparently not possible to export/import rich text image data between orgs!

Looks like I’m in for a fun evening of copying and pasting.