Problem
I was trying to send JSON data through a URL and it was trickier than I thought.I was getting "bad URL" returned from the receiving sever because there were characters in the data which were interfering with the rest of the JSON record that I was trying to post. I needed to strip these out.
I tried a few things that I thought looked promising including UTL_URL.escape.
I also went through tons of forums and the below function ALMOST made the cut, but it couldn't handle large fields.
with trans_tbl(ch_frm, str_to) as (
select '"', '\"' from dual union
select '/', '\/' from dual union
select '\', '\\' from dual union
select chr(8), '\b' from dual union -- BS
select chr(12), '\f' from dual union -- FF
select chr(10), '\n' from dual union -- NL
select chr(13), '\r' from dual union -- CR
select chr(9), '\t' from dual -- HT
),
inp_str(txt) as (
select string_in from dual
)
select SUBSTR(max(replace(sys_connect_by_path(ch,'`'),'`')),1,32000) as c_text
into s_converted
from (
select lvl
,decode(str_to,null,substr(txt, lvl, 1),str_to) as ch
from inp_str cross join (select level lvl from inp_str connect by level <= length(txt))
left outer join trans_tbl on (ch_frm = substr(txt, lvl, 1))
)
connect by lvl = prior lvl+1
start with lvl = 1;
Solution
I settled on this where b.question_text is a clobregexp_replace(REGEXP_REPLACE(ASCIISTR(substr(replace(b.question_text,'"',''),1,3500)), '\\[[:xdigit:]]{4}', ''),'<.*?>')
The data was coming from a MSSQL database so "asciistr" ensures that we don't have any strange characters
It works in conjunction with '\\[[:xdigit:]]{4}' to strip out the literals returns by asciistr.
Finally, another egexp_replace, this time with "<.*?>", which trips out all of the HTML tags in my data.
The 3500 character limit was necessary in my case, but I'm not sure if this was just a restriction on my receiving server.
Acknowledgment
http://psoug.org/definition/ASCIISTR.htm
http://stackoverflow.com/questions/2236475/finding-and-removing-non-ascii-characters-from-an-oracle-varchar2
Comments
Post a Comment