The Problem:
Couchbase buckets can accept any valid JSON value: JSON objects, JSON arrays, and scalar values (strings, booleans, numbers, and null). However, the Analytics service only accepts JSON objects. Binary documents are also ignored because binary data cannot be queried. For example, this document is ignored by Analytics:
[1,2,3]
since it is a JSON array.
The following document will be part of the dataset:
{
  "numbers": [1,2,3]
}
since it is a JSON object.
Therefore, if the bucket has documents that are not JSON objects, the Analytics dataset will have fewer documents than the bucket (assuming no filters are configured on the dataset).
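The rule described above can be checked on the client side before a document is written. The following is a minimal sketch in JavaScript, assuming the document body is available as a JSON string; `isJsonObject` is a hypothetical helper name and is not part of any Couchbase SDK.

```javascript
// Hypothetical helper (not a Couchbase API): returns true only when the
// text parses as JSON and the top-level value is a JSON object.
function isJsonObject(text) {
  let value;
  try {
    value = JSON.parse(text);
  } catch (err) {
    // Not valid JSON at all (for example, binary data).
    return false;
  }
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

console.log(isJsonObject('[1,2,3]'));              // false: JSON array
console.log(isJsonObject('{"numbers": [1,2,3]}')); // true: JSON object
console.log(isJsonObject('"hello"'));              // false: scalar
console.log(isJsonObject('null'));                 // false: scalar
```

The explicit `null` check matters because `typeof null` is `'object'` in JavaScript.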
Versions Impacted:
At the time of this writing, this limitation affects all versions of Couchbase. There is an open feature request to address it.
Workaround:
Unfortunately, the only viable workaround is for the application to write JSON objects. It can be useful to identify the documents that are not JSON objects; see below for the steps to identify them.
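One way an application might follow this workaround is to wrap any non-object value in an object before storing it. The sketch below is an illustration only: `toJsonObject` and the default `value` field name are hypothetical choices, not a Couchbase convention. It assumes the input is already a JSON-compatible value (for example, the result of `JSON.parse`), and any code that reads the document would need to unwrap the field.

```javascript
// Hypothetical helper: ensures the value written to the bucket is a JSON object.
// Plain objects pass through unchanged; arrays and scalars are wrapped under a
// field name chosen by the application ("value" here is an assumption).
function toJsonObject(value, fieldName = 'value') {
  const isPlainObject =
    value !== null && typeof value === 'object' && !Array.isArray(value);
  return isPlainObject ? value : { [fieldName]: value };
}

console.log(JSON.stringify(toJsonObject([1, 2, 3])));        // {"value":[1,2,3]}
console.log(JSON.stringify(toJsonObject({ numbers: [1] })));  // {"numbers":[1]}
console.log(JSON.stringify(toJsonObject('hello', 'text')));   // {"text":"hello"}
```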
How do you identify documents that aren’t JSON objects?
If you have Query/Index nodes, create the following index:
CREATE INDEX idx_non_objects ON `default`(TYPE(SELF))
WHERE TYPE(SELF) <> 'object';
and run this query:
SELECT DISTINCT RAW META(d).id
FROM `default` d
WHERE TYPE(d) <> 'object';
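If you don’t have Query/Index nodes, you can use views. Here is the map function:
function (doc, meta) {
  // Emit the key of every document whose top-level value is not a plain object.
  // The `doc &&` guard skips falsy values (null, false, 0, and the empty string).
  if (doc && doc.constructor !== Object) {
    emit(meta.id, null);
  }
}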
Note that the above query and map function will include keys for binary documents.
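Before deploying the view, the map function's condition can be tried locally. The sketch below runs a copy of the map function in plain JavaScript (for example, under Node.js) with a stub `emit`. The stub and the sample documents are hypothetical, and the real view engine may represent documents differently (binary documents in particular), so treat this as a sanity check only.

```javascript
// Copy of the map function from the article, assigned to a name so it can be called.
const map = function (doc, meta) {
  if (doc && doc.constructor !== Object) {
    emit(meta.id, null);
  }
};

// Stub for the view engine's emit(): collects emitted keys in memory.
const emitted = [];
function emit(key, value) {
  emitted.push(key);
}

// Hypothetical sample documents keyed by document ID.
const samples = {
  'doc::array': [1, 2, 3],
  'doc::string': 'hello',
  'doc::object': { numbers: [1, 2, 3] },
};

for (const [id, doc] of Object.entries(samples)) {
  map(doc, { id });
}

console.log(emitted); // [ 'doc::array', 'doc::string' ]
```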
Useful links:
https://www.json.org/json-en.html
https://docs.couchbase.com/server/6.6/n1ql/n1ql-language-reference/typefun.html
https://docs.couchbase.com/server/6.6/learn/views/views-writing.html