Monday, August 16, 2010

Replacing special characters in a SOAP object / DB string

Finding special characters in an XML / SOAP / database string can be painful. I recently ran into this when a user copied text from a Word document and pasted it into a rich text editor / text editor. An error then occurred while deserializing the text / saving it to the database.
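Before replacing anything, it can help to see exactly which characters are causing trouble. What follows is a minimal sketch, not part of the original code: `SpecialCharFinder` and `FindSuspectChars` are hypothetical names, and flagging everything outside printable ASCII (except tab, CR and LF) is an assumption you may want to loosen.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not from the original post.
public static class SpecialCharFinder
{
    // Returns one "index: U+XXXX" entry for every character outside
    // printable ASCII, ignoring tab, carriage return and line feed.
    public static List<string> FindSuspectChars(string s)
    {
        var hits = new List<string>();
        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];
            bool allowedControl = c == '\t' || c == '\r' || c == '\n';
            if ((c < 0x20 && !allowedControl) || c > 0x7E)
            {
                hits.Add(string.Format("{0}: U+{1:X4}", i, (int)c));
            }
        }
        return hits;
    }
}
```

For example, `SpecialCharFinder.FindSuspectChars("He said \u201Chi\u201D")` reports the two curly quotes as `8: U+201C` and `11: U+201D`, which tells you what to add to the replacement list.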
The code below replaces some known special characters; you can keep adding to the list if you wish.
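  using System.Text;

  public static class StringHelper // hypothetical class name; the post shows only the methods
  {
    // Round-trips the string through UTF-16 bytes and the system default encoding,
    // then strips NUL and form-feed characters
    public static string GetUnicodeString(string s)
    {
          byte[] unicodeByte = Encoding.Unicode.GetBytes(s);
          // Chain further .Replace(...) calls here for other characters to strip;
          // note that Replace("", "") with an empty search string throws ArgumentException
          string unicodechars = Encoding.Default.GetString(unicodeByte).Replace("\0", "").Replace("\f", "");
          return unicodechars;
    }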

    public static string GetUnicodeString(string s, bool replaceQuotes)
    {
      //Replace MS Word quotes with standard quotes (Oracle seems to store only 7-bit ASCII)
      if (replaceQuotes)
      {
        if (s.IndexOfAny(new char[] { (char)8220, (char)8221, (char)8217, (char)8216 }) != -1)
        {
          // 8220/8221 are the curly double quotes, 8216/8217 the curly single quotes
          s = s.Replace((char)8220, '"').Replace((char)8221, '"').Replace((char)8216, '\'').Replace((char)8217, '\'');
        }
      }
      return (GetUnicodeString(s));
    }
  }
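If the list of characters keeps growing, chaining `Replace` calls gets hard to maintain. One alternative is a lookup table, sketched below. This is not the code from above: `WordCharCleaner` and `Clean` are hypothetical names, and the set of Word characters and their ASCII stand-ins is an assumption to adjust for your own data.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not from the original post.
public static class WordCharCleaner
{
    // Assumed list of common Word "smart" characters and plain-ASCII stand-ins.
    private static readonly Dictionary<char, string> Replacements =
        new Dictionary<char, string>
        {
            { '\u2018', "'" },   // left single quote
            { '\u2019', "'" },   // right single quote / apostrophe
            { '\u201C', "\"" },  // left double quote
            { '\u201D', "\"" },  // right double quote
            { '\u2013', "-" },   // en dash
            { '\u2014', "-" },   // em dash
            { '\u2026', "..." }, // horizontal ellipsis
            { '\u00A0', " " },   // non-breaking space
            { '\f', "" },        // form feed (not allowed in XML 1.0)
            { '\0', "" }         // NUL (not allowed in XML 1.0)
        };

    // Walks the string once and swaps every character found in the table.
    public static string Clean(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            string replacement;
            if (Replacements.TryGetValue(c, out replacement))
                sb.Append(replacement);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this table, `WordCharCleaner.Clean("\u201CIt\u2019s done\u2026\u201D")` returns `"It's done..."` wrapped in straight double quotes. Adding a new character is then a one-line change to the dictionary.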

Hope this helps.

Development Continues...
