Is Your Latest Data Really the Latest? Check Your Data Update Mechanism

In databases, a data update means adding, deleting, or modifying data. Timely data updates are an important part of high-quality data services.

Technically speaking, there are two types of data updates: you either update a whole row (Row Update) or update only some of the columns (Partial Column Update). Many databases support both, but in different ways. This post is about one of them, which is simple to execute and effective at guaranteeing data quality.
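To make the distinction concrete, here is a minimal, database-agnostic sketch in Python; the table layout and the function names `row_update` and `partial_column_update` are purely illustrative and not tied to any particular engine.

```python
# Rows are modeled as plain dicts keyed by a primary key.

table = {
    1: {"name": "Alice", "city": "Boston", "score": 90},
    2: {"name": "Bob",   "city": "Denver", "score": 75},
}

def row_update(table, key, new_row):
    """Replace the entire row: every column must be supplied again."""
    table[key] = dict(new_row)

def partial_column_update(table, key, changed_columns):
    """Merge only the changed columns into the existing row."""
    table[key].update(changed_columns)

row_update(table, 1, {"name": "Alice", "city": "Austin", "score": 95})
partial_column_update(table, 2, {"score": 80})   # name and city stay intact
print(table)
```

The trade-off the sketch hints at: a whole-row update is simple but forces the writer to provide (or re-read) every column, while a partial column update touches only what changed.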

Hot-Cold Data Separation: What, Why, and How?

Apparently, hot-cold data separation is hot now. But first of all:

What Is Hot/Cold Data?

In simple terms, hot data is data that is accessed frequently, while cold data is data you seldom visit but still need. Normally in data analytics, data is "hot" when it is new and gets colder as time goes by.
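As a rough illustration, here is a tiny Python sketch that classifies a record as hot or cold purely by age; the 7-day threshold is an arbitrary assumption for this example, and real policies may also weigh access frequency.

```python
from datetime import datetime, timedelta, timezone

# Toy age-based classifier: anything newer than the hot window is "hot".
HOT_WINDOW = timedelta(days=7)

def classify(record_time: datetime, now: datetime) -> str:
    return "hot" if now - record_time <= HOT_WINDOW else "cold"

now = datetime.now(timezone.utc)
print(classify(now - timedelta(days=2), now))   # -> hot
print(classify(now - timedelta(days=30), now))  # -> cold
```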

Say Goodbye to OOM Crashes

What guarantees system stability in large data query tasks? An effective memory allocation and monitoring mechanism. It is how you speed up computation, avoid memory hotspots, respond promptly to memory shortages, and minimize OOM errors.
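As a sketch of the idea (not any specific database's API), the snippet below shows per-query memory accounting: a tracker reserves memory against a configurable limit and rejects a request that would exceed it, so the query can be cancelled gracefully instead of running the process into an OOM kill. The class and limit values are assumptions for illustration.

```python
class MemoryLimitExceeded(Exception):
    pass

class MemoryTracker:
    """Tracks how much memory a query has reserved against a hard limit."""

    def __init__(self, limit_bytes: int):
        self.limit = limit_bytes
        self.used = 0

    def allocate(self, nbytes: int) -> None:
        if self.used + nbytes > self.limit:
            raise MemoryLimitExceeded(
                f"requested {nbytes} B, only {self.limit - self.used} B available")
        self.used += nbytes

    def release(self, nbytes: int) -> None:
        self.used = max(0, self.used - nbytes)

tracker = MemoryTracker(limit_bytes=100 * 1024 * 1024)   # 100 MB per-query limit
tracker.allocate(60 * 1024 * 1024)                       # fits within the limit
try:
    tracker.allocate(60 * 1024 * 1024)                   # would exceed the limit
except MemoryLimitExceeded as e:
    print("query cancelled gracefully:", e)
```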

From a database user's perspective, how does bad memory management hurt? Here is a list of things that used to bother our users:

Understanding Data Compaction in 3 Minutes

What is compaction in a database? Think of your disks as a warehouse: the compaction mechanism is like a team of storekeepers (with genius organizing skills, like Marie Kondo's) who help put away the incoming data. 

In particular, the data (the inflowing cargo in this metaphor) comes in on a "conveyor belt," which does not allow cutting in line. This is how the LSM-Tree (Log-Structured Merge Tree) works: in data storage, data is written into MemTables in an append-only manner, and then the MemTables are flushed to disks to form files. (These files go by different names in different databases; in my community, we call them Rowsets.) Just like packing small boxes of cargo into a large container, compaction means merging multiple small rowset files into a big one, but it does much more than that. As I said, the compaction mechanism is an organizing magician: 
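Here is a toy Python model of that flow, purely to illustrate the mechanics described above; the class and method names are made up and do not reflect any real storage engine's code.

```python
class ToyLSMStore:
    def __init__(self, memtable_limit: int = 3):
        self.memtable = {}          # append-only writes land here first
        self.rowsets = []           # each flush produces one sorted "rowset" on disk
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Flush the MemTable as a new sorted rowset (newest rowset last).
        if self.memtable:
            self.rowsets.append(sorted(self.memtable.items()))
            self.memtable = {}

    def compact(self):
        # Merge all small rowsets into one big rowset, keeping only the
        # newest version of each key -- the "storekeepers" at work.
        merged = {}
        for rowset in self.rowsets:      # older rowsets first
            merged.update(dict(rowset))  # newer values overwrite older ones
        self.rowsets = [sorted(merged.items())]

store = ToyLSMStore()
for i in range(7):
    store.put(f"k{i % 4}", i)            # repeated keys leave stale versions behind
store.flush()
print(len(store.rowsets), "rowsets before compaction")
store.compact()
print(store.rowsets)                     # one rowset, latest value per key
```

Even in this toy version you can see the payoff: reads would have to consult every small rowset before compaction, but only one afterward, and stale versions of a key are dropped along the way.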

Building A Log Analytics Solution 10 Times More Cost-Effective Than Elasticsearch

Logs often make up the majority of a company's data assets. Examples include business logs (such as user activity logs) and operation and maintenance (O&M) logs of servers, databases, and network or IoT devices.

Logs are the guardian angel of a business. On the one hand, they provide system risk alerts and help engineers quickly locate root causes during troubleshooting. On the other hand, if you zoom out to a longer time range, you might identify helpful trends and patterns, not to mention that business logs are the cornerstone of user insights.