USERS AFFECTED:
All
PROBLEM DESCRIPTION:
Inserting a large amount of multi-byte character data into a
CLOB column can result in data corruption. In that case the
resulting data in the CLOB column contains 0xFC characters.
PROBLEM SUMMARY:
The problem can occur when working with large LOB data (>128 kB).
The LOB data is processed in chunks as they are passed by the
client. When the internal LOB processing buffer containing the
data chunks is full and there is still data to process, the LOB
data is translated by the codepage conversion functions and moved
into a temporary table, and then the next chunk of data is
processed. In this particular defect situation the buffer-full
condition occurs in the middle of a multi-byte character. The
last byte in the buffer is not a complete multi-byte character,
so it is supposed to be remembered and used as the first byte of
the following chunk. Due to a defect this dangling byte is
dropped, causing all the multi-byte characters in the following
chunk to be byte shifted.
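The carry-over behaviour described above can be illustrated with a
minimal Python sketch. It is not the product's actual code: the
function name convert_in_chunks, the keep_dangling flag, the 7-byte
buffer and the fixed 2-byte character width (UTF-16BE, chosen to
match the dumps in this report, with no surrogate pairs) are all
assumptions made for illustration only.

```python
# Hypothetical sketch; names and sizes are illustrative assumptions,
# not the real LOB processing code.
CHAR_WIDTH = 2  # UTF-16BE code units, assuming no surrogate pairs


def convert_in_chunks(data: bytes, buffer_size: int,
                      keep_dangling: bool = True) -> str:
    """Convert UTF-16BE bytes to text one buffer at a time."""
    out = []
    dangling = b""  # trailing byte of a character split by the buffer boundary
    for start in range(0, len(data), buffer_size):
        buf = dangling + data[start:start + buffer_size]
        usable = len(buf) - (len(buf) % CHAR_WIDTH)
        complete, rest = buf[:usable], buf[usable:]
        out.append(complete.decode("utf-16-be", errors="replace"))
        # Correct behaviour carries the partial character forward;
        # discarding it models the defect described above.
        dangling = rest if keep_dangling else b""
    return "".join(out)


text = "java.lang.String"
raw = text.encode("utf-16-be")

good = convert_in_chunks(raw, buffer_size=7)
bad = convert_in_chunks(raw, buffer_size=7, keep_dangling=False)
```

With keep_dangling=True the text survives the odd-sized buffer
unchanged; with keep_dangling=False the chunk that follows a split
character is decoded one byte out of alignment.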
Good data:
006A0061 00760061 002E006C 0061006E .j.a.v.a...l.a.n
0067002E 00530074 00720069 006E0067 .g...S.t.r.i.n.g
0022003E 003C0021 005B0043 00440041 .".>.<.!.[.C.D.A
Bad data (byte shifted):
74007200 69006E00 67002200 3E003C00 t.r.i.n.g.".>.<.
21005B00 43004400 41005400 41005B00 !.[.C.D.A.T.A.[.
63006F00 6D002E00 61007000 74007200 c.o.m...a.p.t.r.
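The byte shift visible in the dumps can be reproduced in a few lines
of Python. This is only an illustration under the assumption that
the data is UTF-16BE, as the good dump suggests; the helper name
hex_words is hypothetical.

```python
def hex_words(data: bytes, width: int = 4) -> str:
    """Format bytes as upper-case hex, grouped into words of `width` bytes."""
    hexed = data.hex().upper()
    step = width * 2
    return " ".join(hexed[i:i + step] for i in range(0, len(hexed), step))


aligned = "tring\">".encode("utf-16-be")
# Losing a single leading byte shifts every following character by one byte.
shifted = aligned[1:]

print(hex_words(aligned))
print(hex_words(shifted))

# Decoding the shifted bytes pairs each low byte with the next high byte,
# so the result no longer contains the original ASCII letters.
usable = len(shifted) - (len(shifted) % 2)
garbled = shifted[:usable].decode("utf-16-be", errors="replace")
```

The shifted output starts with 74007200 69006E00, the same pattern
as the first words of the bad dump.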
Since these byte-shifted characters may be invalid in the codepage
in use, the codepage conversion functions can return a warning
like this:
SQLSTATE 01517: A character that could not be converted was
replaced with a substitute character.