Bigtable
Bigtable is a compressed, high-performance, highly scalable data storage system built on the Google File System (GFS). It stores large-scale structured data and is suited to cloud computing.
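Development of Bigtable began in 2004[1], and many Google applications now use it. MapReduce jobs, for example, often read data from and write data to Bigtable,[2] and other users include Google Reader[3], Google Maps[4], Google Book Search, "My Search History", Google Earth, Blogger.com, Google Code hosting, Orkut[4], YouTube[5], and Gmail[6]. Google developed its own large-scale database mainly for reasons of performance[7].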
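Bigtable is not a traditional relational database and does not support SQL constructs such as JOIN. It is closer to what are now called table-oriented NoSQL stores, and its strengths are scalability and performance. Each value in a Bigtable table is addressed by a row key, a column key, and a timestamp. Row keys are arbitrary strings; in the web-page table example, the row key is the URL with its hostname reversed, so www.google.com is stored as com.google.www. A table is split into tablets, each roughly 100-200 MB, and each machine serves about 100 tablets. Tablet data is persisted as SSTables, which are immutable: once written, a stored file is never modified. Data is also compressed, and the compression is applied either at the table level or at the system level. The client holds a pointer to the META0 tablet, and the META0 tablet keeps the records for all META1 tablets.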
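The data model described above can be sketched as a map from (row key, column key, timestamp) to a value. The following is a minimal in-memory illustration, not Bigtable's actual API: the names `reverse_hostname`, `TinyTable`, `put`, `get_latest`, and `scan_prefix`, as well as the `contents:` column name, are assumptions made for this example. It also shows why reversing the hostname is useful: pages from the same domain end up with adjacent row keys.

```python
from typing import Dict, List, Optional, Tuple


def reverse_hostname(host: str) -> str:
    """Reverse the dot-separated parts: www.google.com -> com.google.www."""
    return ".".join(reversed(host.split(".")))


class TinyTable:
    """Toy stand-in for a table: (row key, column key, timestamp) -> value.

    Hypothetical class for illustration only; it keeps everything in a
    Python dict and has no tablets, SSTables, or compression.
    """

    def __init__(self) -> None:
        self._cells: Dict[Tuple[str, str, int], bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        # Each write adds a new version; earlier versions are kept.
        self._cells[(row, column, timestamp)] = value

    def get_latest(self, row: str, column: str) -> Optional[bytes]:
        # Collect every version of the cell and return the newest one.
        versions = [
            (ts, value)
            for (r, c, ts), value in self._cells.items()
            if r == row and c == column
        ]
        if not versions:
            return None
        return max(versions, key=lambda pair: pair[0])[1]

    def scan_prefix(self, prefix: str) -> List[str]:
        # Row keys are compared as strings, so rows that share a prefix
        # (for example, one domain) are contiguous in sorted order.
        return sorted({r for (r, _, _) in self._cells if r.startswith(prefix)})


table = TinyTable()
table.put(reverse_hostname("www.google.com"), "contents:", 1, b"<html>v1</html>")
table.put(reverse_hostname("www.google.com"), "contents:", 2, b"<html>v2</html>")
table.put(reverse_hostname("maps.google.com"), "contents:", 1, b"<html>maps</html>")
table.put(reverse_hostname("www.example.org"), "contents:", 1, b"<html>other</html>")

latest = table.get_latest("com.google.www", "contents:")
google_rows = table.scan_prefix("com.google.")
```

Here `latest` holds the version written with the higher timestamp, and `google_rows` lists only the two google.com hosts, which sit next to each other because their reversed names share the `com.google.` prefix.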
Notes
- [1] "First an overview. BigTable has been in development since early 2004 and has been in active use for about eight months (about February 2005)." Google's BigTable. Archived at the Internet Archive, archive date 2006-06-16.
- [2] "Bigtable can be used with MapReduce, a framework for running large-scale parallel computations developed at Google. We have written a set of wrappers that allow a Bigtable to be used both as an input source and as an output target for MapReduce jobs". Page 3 of "Bigtable: A Distributed Storage System for Structured Data", 2006.
- [3] "Reader is using Google's BigTable in order to create a haven for what is likely to be a massive trove of items." Official Google Reader blog.
- [4] "There are currently around 100 cells for services such as Print, Search History, Maps, and Orkut." Google's BigTable. Archived at the Internet Archive, archive date 2006-06-16.
- [5] "Their new solution for thumbnails is to use Google’s BigTable, which provides high performance for a large number of rows, fault tolerance, caching, etc. This is a nice (and rare?) example of actual synergy in an acquisition." YouTube Scalability Talk.
- [6] "How Entities and Indexes are Stored - Google App Engine - Google Code". Archived at the Internet Archive, archive date 2011-10-06.
- [7] "We have described Bigtable, a distributed system for storing structured data at Google....Our users like the performance and high availability provided by the Bigtable implementation, and that they can scale the capacity of their clusters by simply adding more machines to the system as their resource demands change over time...Finally, we have found that there are significant advantages to building our own storage solution at Google. We have gotten a substantial amount of flexibility from designing our own data model for Bigtable." From the Conclusion of "Bigtable: A Distributed Storage System for Structured Data", 2006.
External links
- Bigtable: A Distributed Storage System for Structured Data (official paper; PDF)
- BigTable: A Distributed Structured Storage System (video)
- more video
- Google's BigTable (notes on the official presentation)
- "How Google Works"
- Is the Relational Database Doomed?