Data Loading: Loading data into OTM / G-Log through CSVs, XML, and the UI.

  #1 (permalink)  
June 14th, 2007, 04:00
Junior Member
 
Join Date: May 2007
Posts: 19
Thanks: 0
Thanked 0 Times in 0 Posts
Groans: 0
Groaned at 0 Times in 0 Posts
Rep Power: 0
engyeowkee is on a distinguished road
[SOLVED] CSV and UTF8 Encoding

Hi guys,

We're trying to upload data containing Chinese characters, but the CSV upload doesn't seem to support UTF-8 encoding. Has anyone been able to load double-byte (Asian-language) data into OTM using CSV, or must we use XML? Thanks!

Kee
  #2 (permalink)  
June 14th, 2007, 11:11
Senior Member and Blogger
 
Join Date: Dec 2006
Location: Singapore
Posts: 130
Blog Entries: 5
Thanks: 4
Thanked 8 Times in 7 Posts
Groans: 0
Groaned at 0 Times in 0 Posts
Rep Power: 2
ianlo is on a distinguished road
Re: CSV and UTF8 Encoding

Hi Kee,

If you are loading into OTM 5.0, you should not be hitting this problem; if you are, Metalink is the next option. If you are loading into OTM 5.5, read on!

Yes, you can load UTF-8 CSV files (in 5.0 and 5.5). However, are you creating the file in Windows using Notepad or Excel? If you are, you will have problems loading it into OTM 5.5, because Windows automatically prepends a byte order mark (BOM, the 3 bytes EF BB BF) to the beginning of the file.

You need to strip the BOM before loading the file via CSV upload.
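
For anyone without a BOM-stripping tool at hand, the step above can be sketched in a few lines of Python. This is only an illustration, not the tool mentioned in this thread: the function names `strip_utf8_bom` and `strip_bom_from_file` are made up, and it assumes the file is small enough to read into memory.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(src_path: str, dst_path: str) -> bool:
    """Copy src_path to dst_path without a leading BOM.

    Returns True if a BOM was found and removed. (Hypothetical helper.)
    """
    # Work on raw bytes so the rest of the file is copied unchanged.
    with open(src_path, "rb") as src:
        data = src.read()
    cleaned = strip_utf8_bom(data)
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
    return len(cleaned) != len(data)
```

Calling `strip_bom_from_file("LOCATION.csv", "LOCATION_nobom.csv")` would write a copy that starts directly with the first data byte; the output file name here is just an example.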

I have been loading UTF-8 CSV files into 5.0 regularly, so that should not be an issue, but when we started using 5.5 the BOM gave us quite a few headaches.

PS: I have passed a tool called UniCSVed to Simon. It lets you save Unicode text files (tab-delimited, etc.) in CSV format for loading.

Last edited by ianlo : June 14th, 2007 at 11:33.
  #3 (permalink)  
June 15th, 2007, 01:09
engyeowkee is on a distinguished road
Re: CSV and UTF8 Encoding

Hi Ian,

Indeed, we're trying to upload a UTF-8 CSV into 5.5. Thank you so much for your help!

Kee
  #4 (permalink)  
June 15th, 2007, 01:35
ianlo is on a distinguished road
Re: CSV and UTF8 Encoding

Hi Kee,

No problem. This also applies to XML files btw. If you need a BOM stripper for XML, you can download Xerces from apache.org and compile their example programs. There is a DOMPrint program that can remove the BOM.
  #5 (permalink)  
June 18th, 2007, 03:02
engyeowkee is on a distinguished road
Re: CSV and UTF8 Encoding

Hi Ian,

Perhaps I'm not using the tool correctly, as I'm not able to preserve the Chinese characters despite encoding the file in UTF-8 without a BOM. Below is the process I went through:

1. Save LOCATION.txt (exported from OTM) as a UTF-8 text file using Notepad.
2. Open LOCATION.txt using UniCSVed, encode it in UTF-8 without BOM, and save it as a CSV file.
3. Open the LOCATION.csv file in Excel to add/format the data to be uploaded.
4. Save LOCATION.csv in Excel (all Chinese characters are no longer preserved after the save).
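
To see at which of these steps the characters are lost, each intermediate file can be inspected. The following is a rough Python sketch, not part of any OTM tooling; the function name `describe_encoding` and the report keys are invented for illustration. It reports whether the bytes start with a UTF-8 BOM, whether they decode as UTF-8 at all, and how many non-ASCII characters survive.

```python
import codecs


def describe_encoding(data: bytes) -> dict:
    """Summarise the encoding-related properties of a file's raw bytes."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        # "utf-8-sig" decodes UTF-8 and silently skips a leading BOM.
        text = data.decode("utf-8-sig")
        valid = True
    except UnicodeDecodeError:
        text = ""
        valid = False
    # Chinese characters are non-ASCII, so a count of 0 means they are gone.
    non_ascii = sum(1 for ch in text if ord(ch) > 127)
    return {
        "has_utf8_bom": has_bom,
        "valid_utf8": valid,
        "non_ascii_chars": non_ascii,
    }
```

Reading a file with `open(path, "rb").read()` and passing the bytes to this function after each step would show where `non_ascii_chars` drops to zero or `valid_utf8` becomes false.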

Any thoughts/suggestions, thanks!

kee
  #6 (permalink)  
June 18th, 2007, 03:15
ianlo is on a distinguished road
Re: CSV and UTF8 Encoding

Hi Kee,

The problem is that Excel cannot save CSV in UTF-8 format. It can only save Unicode data as a tab-delimited text file (Save As > Unicode Text).

You should save your modified LOCATION.csv file from Excel as a tab-delimited Unicode text file and then use UniCSVed to save it as a UTF-8 CSV without a BOM.
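
If UniCSVed is not available, the same conversion can be approximated in Python. This is a sketch under assumptions, not a description of how UniCSVed works: it assumes Excel's "Unicode Text" output is UTF-16 with a BOM and tab-delimited, the function name `unicode_text_to_utf8_csv` is made up, and the line ending written here ("\n") may need adjusting for your environment.

```python
import csv
import io


def unicode_text_to_utf8_csv(raw: bytes, source_encoding: str = "utf-16") -> bytes:
    """Convert tab-delimited Unicode text to comma-separated UTF-8 without a BOM."""
    # The "utf-16" codec reads the BOM to pick the byte order and drops it.
    text = raw.decode(source_encoding)
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    # Encoding with plain "utf-8" (not "utf-8-sig") never writes a BOM.
    return out.getvalue().encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read an Excel Unicode Text file and write a UTF-8 CSV (hypothetical helper)."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(unicode_text_to_utf8_csv(raw))
```

For example, `convert_file("LOCATION.txt", "LOCATION.csv")` would turn the tab-delimited file saved from Excel into a CSV; fields containing commas are quoted by the `csv` writer.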

Hope this helps!

You can call me if you need any help, or look me up on Skype.

Ian
All times are GMT.
Copyright © 2008, Open Book Solutions LLC. All rights reserved.
