巴西龟吃什么食物| 嘴角烂了是什么原因| zzegna是什么牌子价格| 小觑是什么意思| 昆仑山在什么地方| 整夜做梦是什么原因| 卦不走空是什么意思| 吃什么对皮肤好还能美白的| 发烧是什么感觉| 欧了是什么意思| 淋巴结有什么症状| flag是什么意思| 痛风该吃什么药好得快| 肠胃感冒是什么症状| 熬夜到什么程度会猝死| 关帝庙求什么最灵| 戒指丢了暗示着什么| 血氧仪是干什么用的| 容颜是什么意思| 甲状腺发炎有什么症状| suv什么意思| fnc是什么意思| 失眠吃什么食物效果最好| 从来就没有什么救世主| 为什么蝙蝠会飞进家里| 右小腿抽筋是什么原因| 肾虚吃什么好| 外阴白斑是什么引起的| 尾椎骨痛挂什么科| 咽炎吃什么药好| 来月经为什么会肚子痛| 淋巴细胞减少说明什么| 逆钟向转位是什么意思| 什么补蛋白最快的食物| 老觉得饿是什么原因| 防蓝光眼镜有什么好处| 尿酸高是什么原因导致的| 眉中间有竖纹代表什么| 尖锐湿疣吃什么药| 身体肿是什么原因引起的| 午餐吃什么好又能减肥| 温暖的近义词是什么| 餐后血糖高吃什么药| 膝盖小腿酸软无力是什么原因| 吃什么紧致皮肤抗衰老| 黄体酮有什么副作用| 贡高我慢是什么意思| 排卵期会有什么症状| 肚子咕噜咕噜响是什么原因| 香皂和肥皂有什么区别| 翻新机是什么意思| 鸡蛋加什么吃壮阳持久| 脾胃不好吃什么水果| 甲钴胺片主要治什么病| 痈疽是什么意思| 拘留是什么意思| 适合是什么意思| upc码是什么意思| 尿酸高适合喝什么茶| 做梦梦到踩到屎是什么意思| 腿肿脚肿是什么病的前兆| 什么目什么身| 海鲜不能和什么水果一起吃| 夏枯草长什么样| 经常掏耳朵有什么危害| 钙吃多了有什么副作用| 炀是什么意思| 锌是补什么的| 肾结晶是什么病| ppd试验是什么意思| 耳朵发炎用什么药| 狗冲什么生肖| 破釜沉舟是什么生肖| 孕妇血糖高有什么症状| 老打瞌睡犯困是什么原因| 6月9日是什么星座| 艾滋病窗口期是什么意思| 什么东西不导电| 电头是什么| 维生素d3和d2有什么区别| 大腿为什么会长妊娠纹| 突然好想你你会在哪里是什么歌| 梦见摘黄瓜是什么意思| 什么绿什么红| 高烧拉肚子是什么原因| 掉头发去医院看什么科| 宗是什么意思| 猫咪疫苗什么时候打| 尿隐血阳性什么意思| 淋巴细胞偏低什么意思| 50岁今年属什么生肖| 预科班什么意思| 吃天麻对身体有什么好处| 今年什么时候入梅| 肾虚吃什么食物| 婴儿吃什么奶粉好吸收| 世外桃源什么意思| 眼睛无神呆滞什么原因| 今天是什么生肖日| 什么名字| 兰花象征着什么| 世界第一大运动是什么| 亲故是什么意思| 反绒皮是什么材质| 肌酐低什么原因| 知趣是什么意思| 7月15是什么星座| 血糖高可以吃什么肉类| ti是什么元素| 萎缩性胃炎吃什么食物好| 孔子名叫什么| 空调健康模式是什么意思| 早泄是什么原因| 倒吊人是什么意思| 漂流需要带什么| 力排众议是什么意思| 巫师是什么意思| 子宫病变有什么症状| 突然的反义词是什么| 东营有什么大学| 馒头是什么做的| 什么是神经性皮炎| 空代表什么生肖| 儿童经常头晕什么原因导致的| 真菌涂片检查是查什么| 孩子咳嗽有痰吃什么药| 图字五行属什么| 常德有什么大学| 磁共振是检查什么| 头颅mri是什么检查| 大乌龙是什么意思| 什么分什么裂| 歆是什么意思| 为什么心脏会隐隐作痛| 麸炒是什么意思| 畏首畏尾是什么意思| 见招拆招下一句是什么| 甲亢平时要注意什么| 低筋面粉是什么| 梦见进监狱是什么兆头| 孕妇喝什么牛奶对胎儿好| 减肥什么方法最快最有效| 什么情况下要打破伤风针| ifound是什么牌子| 骨化性肌炎是什么病| m是什么意思| 晨尿茶色是什么原因| 为什么做噩梦| 一览无余是什么意思| 经期吃什么缓解痛经| 浮瓜沉李什么意思| 213什么星座| 吃什么长卵泡| 鱼有念什么| mhc是什么意思| 什么是性骚扰| 什么年龄割双眼皮最好| 副军级是什么级别| 什么东西可以代替阴茎| 大姨妈来了能吃什么水果| 夜里2点到3点醒什么原因| 一月10号是什么星座| 中秋节干什么| 儿童正常体温在什么范围| 云吞是什么| 这个季节吃什么水果| 眼底出血用什么眼药水| 中药什么时间喝效果最好| 宝五行属什么| 口干舌燥是什么意思| 什么门比较好| 冠脉ct能检查出什么| 什么叫便秘| 卫校有什么专业| 小暑是什么时候| 胰腺炎能吃什么| 葛根有什么功效| 柔肝是什么意思| 老放屁是什么病的征兆| 小便短赤是什么症状| 鸡蛋和什么炒好吃| 例假淋漓不尽是什么原因造成的| 胸骨后是什么位置图| 西瓜对人体有什么好处| 记录是什么意思| 脚底脱皮是什么原因| 梅毒螺旋体抗体阴性是什么意思| 酒喝多了喝什么解酒| 阴灵是什么意思| 肛门疼痛是什么原因| 奇异果和猕猴桃有什么区别| 正月初二是什么星座的| 女属蛇的和什么属相最配| 下午一点是什么时辰| 复机是什么意思| 二月十八是什么星座| 十二指肠溃疡是什么症状| 流鼻涕吃什么药最管用| 今天是美国什么节日| 鹅肉炖什么好吃又营养| 反刍是什么意思| 淋巴细胞数偏高是什么意思| 卟啉症是什么病| 灰蓝色是什么颜色| 山茶花是什么颜色| 妈妈的外婆叫什么| 泌尿内科主要看什么病| 花笺是什么意思| 默哀是什么意思| 慢性宫颈炎用什么药好| 糖类抗原125是什么意思| 酸奶和牛奶有什么区别| 92年属猴的是什么命| 梦见狗咬人是什么预兆| 撩 是什么意思| 卜在姓氏里读什么| 云母是什么东西| st-t改变是什么意思| 兔子爱吃什么| 孕晚期破水是什么症状| 如法炮制是什么意思| 茯苓和茯神有什么区别| 中伤是什么意思| 腿长身子短有什么说法| 支气管哮喘是什么原因引起的| 心火旺吃什么中药| 悠闲惬意什么意思| 胸胀痛什么原因| 5月19号是什么星座| 蒙羞是什么意思| 夏天适合种什么植物| 骑马野战指什么生肖| 口腔溃疡挂什么科| 翼字五行属什么| 无花果什么时候种植| 禁的拼音是什么| 吃猪血有什么好处和坏处| 喝酒不能吃什么东西| 关节发黑是什么原因| 早上起床吐痰带血是什么原因| 扩心病是什么病| 布拉吉是什么| 心动过缓吃什么药| 高铁列车长是什么级别| 奶奶的妈妈应该叫什么| 阿司匹林什么时间吃最好| 它是什么结构| 谷草谷丙偏高是什么原因| 客厅沙发后面墙上挂什么画好| 两票制指的是什么| 血管变窄吃什么能改善| 风湿性关节炎挂什么科| 型男是什么意思| 胃底腺息肉什么意思| gigi是什么意思| 肠胃炎吃什么药好得快| 脚上起水泡是什么原因| 女生右手食指戴戒指什么意思| 扒是什么意思| 中意你是什么意思| 俞伯牙摔琴谢知音摔的是什么乐器| 梅毒检查什么项目| 狮子座上升星座是什么| 3岁打什么疫苗| 百度

Sunday, March 23, 2014

What I think of jsonb

Unsurprisingly, there has been a lot of interest in the jsonb type, which made it into the upcoming 9.4 release of Postgres. I was initially a reviewer of jsonb, although since I spent weeks polishing the code, I was ultimately credited as a co-author.

Jsonb is a new datatype for Postgres. It is distinct from the older json datatype in that its internal representation is binary, and in that it is internally typed. It also makes sophisticated nested predicates within queries on jsonb indexable.  I've occasionally described the internally-typed scalar values as having “shadow types” unknown to the core SQL parser. This has several implications. For example, if you sort two Jsonb values containing only scalar numbers, the implementation invokes the numeric comparator (which the jsonb default B-Tree opclass comparator is defined in terms of). The on-disk representation of jsonb includes the same representation as is used for, say, numerics (as the internal binary representation of JSON primitive numbers, for example). Plus, JSON objects are de-duplicated by key on input, and optimized for cheap binary searches within a single jsonb. Still, like the earlier json type, jsonb in every sense “speaks JSON”. There are some limitations on what can be represented as a jsonb number, but those are exactly the same limitations that apply to the core numeric type (plus some limitations imposed by the JSON RFC, such as not accepting NaN values). I hope it suffices to say that these limitations are virtually irrelevant, and that many implementations have similar or worse limitations. All of these minor implementation-defined restrictions are explicitly anticipated and allowed for by the recent JSON RFC-7159.



Jsonb is emphatically not like the BSON format used by MongoDB. That format accepts input in such a way as to be backwards compatible with JSON, but I believe that BSON isn't really a practical interchange format, because the software development community at large is presumably disinclined to buy into an interchange format that as yet is not described by any RFC, or any communiqué of a recognized standards body. In contrast, jsonb is a datatype that will only ever output valid textual JSON, and will only ever accept valid textual JSON (subject to the aforementioned obscure and practically irrelevant restrictions, and the caveat on automatically normalizing duplicate-keyed pairs within the same object). Jsonb also imposes an internal ordering on object pairs. Again, this is all anticipated and allowed for by the JSON RFC.

It's possible that I'm mistaken, and that BSON or something else will emerge as an actual standard (either de facto or de jure), since I've heard that there is support in the works for database systems other than MongoDB. It's not impossible that pursuing something like BSON might be an interesting future direction for Postgres, since for one thing BSON supports more than the 4 standard JSON primitive types. In any case it's important to note that the protocol or on-disk binary representation of jsonb is an implementation detail; we're not in competition with BSON, and this isn't a new standard. It's just a new Postgres datatype, with new indexing capabilities. I think it's notable that BSON doesn't have a JSON-style universal number type. It has 32-bit and 64-bit integer types, and double precision 64-bit IEEE 754 floating point numbers. It strikes me that this omission tells me all I need to know about binary interchange formats.

To understand how the jsonb type works in more detail, I suggest taking a look at the devel documentation. It's worth taking a close look at containment semantics, since that's the really compelling way of searching through jsonb documents.

Strategic significance


There has been a little back and forth among senior community members about the significance of jsonb. Josh Berkus wrote that he thought it was the most important 9.4 feature. Robert Haas was skeptical, preferring the logical decoding stuff. I've even seen one or two people in the comments section of various news articles grumble about Postgres jumping on the JSON bandwagon.

I have to admit that relatively speaking, jsonb is not in and of itself all that technically complex. While it is a great feature, and while I think it puts Postgres in a very competitive position relative to certain other systems, it would be almost trivial to ship a jsonb extension that works with earlier versions of Postgres. However, without taking a position on what the best 9.4 feature is going to be (I like both jsonb and logical decoding, and contributed in various ways to both), I think that it's possible that Josh Berkus and Robert Haas are both more or less right at the same time, and their apparent disagreement reflects only their individual priorities for Postgres.

It is very much to the credit of the principle jsonb authors, Oleg Bartunov and Teodor Sigaev, that with some help from Andrew Dunstan and myself they managed to define what I think is internally a solid nested, strongly-typed format for us to build on, with a textual output representation that just so happens to be the same one that has emerged as a standard for this kind of thing. But, to me, as a Postgres hacker, their previous work – and the previous work of Alexander Korotkov (who, due to an unfortunate oversight, was not credited in the jsonb commit message) – is the real story here. As the authors of GIN, Oleg and Teodor are perhaps most responsible for the foundation on which jsonb is built, a foundation built over many years. Alexander's excellent recent work on improving the GIN access method (with help in various areas from Heikki Linnakangas), which also made it into 9.4, is probably what will end up making jsonb really shine. This includes compression of GIN posting lists,  speeding up "rare & frequent" type GIN queries, multi-key GIN search skipping, and further optimizations to multi-key searching. Alexander had earlier reported some very impressive improvements in PostgreSQL full-text search performance as a result of all of this, with performance apparently competing with that of external systems like Sphinx and Solr. It is likely that many of the same big performance improvements seen there concomitantly benefit the jsonb GIN opclasses.


jsonb_path_ops


Having said that, Alexander's jsonb_path_ops alternate GIN operator class, which was his contribution to the big jsonb patch deserves an honorable mention. By combining GIN with hashing of either key/value pairs, or array elements, resulting indexes can give great performance for sophisticated “containment” type queries against JSON documents. Indexes are a fraction of the size of the data indexed, index scans are incredibly fast, and yet these GIN indexes make indexable very complex nested “containment” queries. The results are so impressive that at last November's pgConf.EU conference, an EXPLAIN ANALYZE comparative example in a presentation given by Oleg and Teodor was greeted with sustained applause.

I'm really pleased that we worked towards making all of this as beneficial as possible to the largest possible number of people, but even as it puts Postgres in a very competitive position with respect to some non-relational systems, jsonb does not really represent any kind of pivot towards Postgres as a document store – Postgres has always been an object-relational system. Rather, I think it is one particular outcome of a much bigger process that has been underway for many years.

I'll watch the future development of the "VODKA" index access method with interest, because at this early stage it is my understanding that it's intended to make searching nested, heterogeneous structures more flexible and better performing still. It seems likely that there will be a number of other applications for that infrastructure too, since like GiST, GIN, and SP-GiST, it is intended to be an extensible infrastructure that serves many analogous needs in a general way.

6 comments:

  1. I very like your notice, that jsonb is a consequence of our (I and Teodor Sigaev) more than decade development - intarray, tsearch, hstore, ltree, full text search, pg_trgm, GiST, GIN, SP-GiST are all about semi-structured data.

    ReplyDelete
  2. Congrats to the Postgres team. jsonb is an exciting feature and I'm looking forward to it! Thanks for all your hard work!

    ReplyDelete
  3. This is indeed going to be a cool feature. We made tests with a different bson implementation and PostgreSQL and it was much faster than MongoDB, so I expect about the same speed-up. In many use cases, people will be much better off with PostgreSQL then.

    What I don't really understand is the roadmap regarding jsonb and hstore. Is jsonb going to be the ultimate type for document-style data? Will it supersede hstore or will they coexist? As I read in a presentation from Oleg, hstore in (probably) 9.4 has similar features to offer.

    ReplyDelete
  4. hstore is going to continue to be useful as a simple key/value store, but I don't think it's likely to expand in scope, given the discussion on list. jsonb is in a certain sense the successor of hstore, given the technical foundation that they share, but that's about the extent of it.

    Originally, we thought that hstore could be made to support multiple input formats, an hstore compatible one, and a JSON one, but during our discussion it transpired that that probably wasn't worth it. Personally, I'm happy to let hstore be hstore. I think that jsonb is definitely the area that we'll target for future improvements to general handling of semi-structured or unstructured data.

    ReplyDelete
  5. I hope all of those TFS performance improvements from 2012 went into the PostgreSQL 9.4 codebase.

    If they are, they should get a lot more attention. They should at least be mentioned in the PostgreSQL 9.4 release notes.

    ReplyDelete
  6. Perhaps you would know the answer to a jsonb question: TOAST does not seem to do any compression on this format, even for large json docs. Is that by design?

    ReplyDelete

部队班长是什么军衔 东北方向五行属什么 神经内科看什么病的 处女座与什么星座最配 钊读什么
为什么丰胸霜一抹就变大 蓝猫为什么叫蓝猫 脾肾阳虚是什么意思 十一月十一号是什么星座 员工体检费计入什么科目
莒姬是什么位分 感悟是什么意思 常吃大蒜有什么好处 电子邮件地址是什么意思 什么生木
硬脂酸是什么 刮痧和拔罐有什么区别 真菌感染什么症状 血氧饱和度什么意思 外耳道耵聍什么意思
青稞是什么hcv9jop0ns1r.cn 指腹为婚是什么意思bjhyzcsm.com 经常吃南瓜有什么好处和坏处hcv9jop2ns5r.cn dha不能和什么一起吃hcv8jop6ns4r.cn 阿昔洛韦乳膏治什么病mmeoe.com
逐年是什么意思hcv8jop1ns0r.cn 伶人是什么意思hcv7jop7ns2r.cn 什么叫发物hcv7jop9ns6r.cn 5.29什么星座hcv8jop4ns9r.cn 阳痿吃什么药dayuxmw.com
蒸鱼豉油可以用什么代替hcv9jop3ns9r.cn 李小龙和丁佩什么关系hcv7jop5ns3r.cn 置换补贴什么意思tiangongnft.com 暗合是什么意思96micro.com 得逞是什么意思dayuxmw.com
成也萧何败也萧何什么意思hanqikai.com 大理有什么好玩的hcv9jop6ns2r.cn 3月15是什么星座hcv9jop2ns7r.cn 肾结石是什么原因造成的hcv7jop6ns8r.cn 2月2号什么星座hcv8jop5ns7r.cn
百度