Data Lake Compute

Native table (Iceberg) format description

Last updated: 2026-05-27 10:59:37

How Native Table (Iceberg) Works

DLC Native Table (Iceberg) uses the Iceberg table format for its underlying storage. It is compatible with the open-source Iceberg capabilities and, in addition, enhances performance through the separation of storage and compute and improves usability.
The Iceberg table format manages user data by dividing it into data files and metadata files.
Data layer: It consists of a series of data files that store user table data. Data files can be in Parquet, Avro, or ORC format; Parquet is the default in DLC.
Because of Iceberg's snapshot mechanism, data that a user deletes is not removed from storage immediately. Instead, a new delete file is written to record what was deleted. Depending on the use case, delete files are either position delete files or equality delete files.
Position delete files record which specific rows within a data file have been deleted.
Equality delete files record deletions by key value and are typically used in upsert scenarios. A delete file is itself a type of data file.
Metadata layer: It consists of manifest files, manifest lists, and metadata files. A manifest file holds metadata for a set of data files, such as file paths, write times, min-max values, and other statistics.
A manifest list references a set of manifest files, typically the manifest files that make up a single snapshot.
Metadata files are in JSON format. They reference the manifest lists and hold table metadata such as the table schema, partitions, and all snapshots. Whenever the table state changes, a new metadata file is generated to replace the existing one, and the Iceberg kernel ensures that this replacement is atomic.
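
The layering described above can be pictured with a small in-memory model. This is only an illustrative sketch, not Iceberg's actual on-disk layout or API: the class names, the single `id` column, and the `plan_files` helper are all hypothetical. It shows how a reader could walk from the metadata file to the current snapshot's manifests and use per-file min-max values to skip data files.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    path: str
    min_id: int  # lower bound of column "id" in this file (hypothetical stat)
    max_id: int  # upper bound of column "id" in this file (hypothetical stat)


@dataclass
class Manifest:
    data_files: list[DataFile]


@dataclass
class Snapshot:
    snapshot_id: int
    manifest_list: list[Manifest]  # manifests that make up this snapshot


@dataclass
class TableMetadata:
    schema: dict[str, str]
    snapshots: list[Snapshot] = field(default_factory=list)

    def current_snapshot(self) -> Snapshot:
        return self.snapshots[-1]


def plan_files(metadata: TableMetadata, wanted_id: int) -> list[str]:
    """Walk metadata -> manifest list -> manifests and keep only data files
    whose [min_id, max_id] range could contain wanted_id."""
    selected = []
    for manifest in metadata.current_snapshot().manifest_list:
        for data_file in manifest.data_files:
            if data_file.min_id <= wanted_id <= data_file.max_id:
                selected.append(data_file.path)
    return selected


m1 = Manifest([DataFile("data/a.parquet", 1, 100),
               DataFile("data/b.parquet", 101, 200)])
m2 = Manifest([DataFile("data/c.parquet", 150, 300)])
table = TableMetadata(
    schema={"id": "long"},
    snapshots=[Snapshot(1, [m1]), Snapshot(2, [m1, m2])],
)

print(plan_files(table, 160))  # only files whose id range covers 160
```

Note that snapshot 2 reuses manifest `m1` unchanged and only adds `m2`, which is how a new snapshot can share most of its metadata with the previous one.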

Native Table (Iceberg) Versions

By usage scenario, DLC Native Table (Iceberg) tables are divided into Append tables and Upsert tables. Append tables use the V1 format, while Upsert tables use the V2 format.
Append tables: These tables support only the Append, Overwrite, and Merge Into write modes.
Upsert tables: In addition to the write modes of Append tables, these tables support the Upsert write mode.
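
One way to picture the Upsert write mode is as an equality delete on the key followed by an insert of the new row, with deletes applied at read time. The sketch below is a toy model under that assumption: the `UpsertTable` class and its methods are hypothetical, and the rule that a delete only affects rows written in earlier commits follows the sequence-number idea of the Iceberg V2 format as understood here, not DLC's implementation.

```python
class UpsertTable:
    """Toy model: rows and equality deletes are tagged with a commit sequence number."""

    def __init__(self, key):
        self.key = key
        self.data = []        # list of (sequence_number, row)
        self.eq_deletes = []  # list of (sequence_number, key_value)
        self.seq = 0

    def append(self, row):
        self.seq += 1
        self.data.append((self.seq, row))

    def upsert(self, row):
        # One commit writes both an equality delete for the key and the new row.
        self.seq += 1
        self.eq_deletes.append((self.seq, row[self.key]))
        self.data.append((self.seq, row))

    def scan(self):
        # Merge on read: a delete hides only rows from earlier commits,
        # so the row written together with the delete stays visible.
        live = []
        for row_seq, row in self.data:
            hidden = any(
                del_seq > row_seq and del_key == row[self.key]
                for del_seq, del_key in self.eq_deletes
            )
            if not hidden:
                live.append(row)
        return live


t = UpsertTable(key="id")
t.append({"id": 1, "name": "a"})
t.append({"id": 2, "name": "b"})
t.upsert({"id": 1, "name": "a2"})  # replaces the earlier row with id 1
print(t.scan())
```

The old row with `id` 1 is still stored, which matches the point that deleted data is not removed from storage immediately; it is only filtered out when the table is read.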

Native Table (Iceberg) Creation Attributes

To manage and use DLC Native Table (Iceberg) effectively, specify certain attributes when you create a table of this type. You can set these attribute values at creation time or modify them later. For detailed instructions, see DLC Native Table Operational Configuration. The attributes are as follows.
| Attribute | Meaning | Configuration Guide |
| --- | --- | --- |
| format-version | Iceberg table version. Valid values are 1 and 2; the default is 1. | If the write scenario includes upsert, this value must be set to 2. |
| write.upsert.enabled | Whether to enable upsert. Set to true to enable it; if not set, upsert is not enabled. | If the write scenario includes upsert, this must be set to true. |
| write.update.mode | Update mode | Set to merge-on-read (MOR) for MOR tables; the default is copy-on-write (COW). |
| write.merge.mode | Merge mode | Set to merge-on-read (MOR) for MOR tables; the default is copy-on-write (COW). |
| write.parquet.bloom-filter-enabled.column.{col} | Whether to enable the bloom filter for column {col}. Set to true to enable it; it is disabled by default. | In upsert scenarios, this must be enabled and configured according to the primary keys of the upstream data. If the upstream has multiple primary keys, use up to the first two. Enabling this can improve MOR query performance and small file merging efficiency. |
| write.distribution-mode | Write distribution mode | The recommended value is hash. With hash, data is automatically repartitioned on write; the drawback is that this may reduce write performance. |
| write.metadata.delete-after-commit.enabled | Whether to enable automatic metadata file cleanup | It is strongly recommended to set this to true. When enabled, old metadata files are cleaned up automatically during snapshot creation, which prevents excess metadata files from building up. |
| write.metadata.previous-versions-max | Number of metadata files to retain | The default value is 100. In certain special cases, you can adjust this value as needed. Use this setting together with write.metadata.delete-after-commit.enabled. |
| write.metadata.metrics.default | Column metrics mode | The value must be set to full. |
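
As an illustration of how the attributes listed above might be combined for an upsert table, the following sketch renders a `CREATE TABLE` statement from a property dictionary. The `render_create_table` helper, the column list, and the bloom-filter column `id` are hypothetical, and the statement follows the general shape of open-source Iceberg DDL for Spark (`USING iceberg ... TBLPROPERTIES`). DLC may require different or additional clauses, so treat DLC Native Table Operational Configuration as the authority.

```python
def render_create_table(table, columns, properties):
    """Render a CREATE TABLE statement with TBLPROPERTIES (illustrative only)."""
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    props = ",\n  ".join(f"'{key}' = '{value}'" for key, value in properties.items())
    return (
        f"CREATE TABLE {table} (\n  {cols}\n) USING iceberg\n"
        f"TBLPROPERTIES (\n  {props}\n)"
    )


# Property values taken from the configuration guidance for upsert scenarios.
upsert_properties = {
    "format-version": "2",
    "write.upsert.enabled": "true",
    "write.update.mode": "merge-on-read",
    "write.merge.mode": "merge-on-read",
    # Assumes the upstream primary key is the column "id".
    "write.parquet.bloom-filter-enabled.column.id": "true",
    "write.distribution-mode": "hash",
    "write.metadata.delete-after-commit.enabled": "true",
    "write.metadata.previous-versions-max": "100",
    "write.metadata.metrics.default": "full",
}

ddl = render_create_table(
    "DataLakeCatalog.db.sample",
    [("id", "bigint"), ("name", "string")],
    upsert_properties,
)
print(ddl)
```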

Core Capabilities of Native Tables (Iceberg)

ACID Transactions

Atomicity: An Iceberg write can delete and insert data within a single operation, and the result is never partially visible to users.
Consistency: Iceberg uses optimistic concurrency control to ensure that data writes do not cause inconsistencies. In the read view, users see only data that has been successfully committed.
Isolation: Iceberg uses its snapshot mechanism and the serializable isolation level to keep reads and writes isolated from each other.
Durability: Once a transaction is successfully committed, it is permanent.

Writing

The writing process follows optimistic concurrency control. A writer assumes that the current table version will not change before it commits. It updates, deletes, or adds data and creates a new version of the metadata file. When the current version is about to be replaced with the new version, Iceberg verifies that the update is based on the current snapshot.
If it is not, a write conflict has occurred: another writer has already updated the current metadata. In this case, the write operation must be redone on top of the current metadata version. A metadata lock ensures that the entire commit-and-replace step is atomic.
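
The commit protocol described above can be sketched as a compare-and-swap on the table version. This is a simplified model rather than Iceberg's implementation: the `Catalog` class, `CommitConflict`, and `append_with_retry` are hypothetical names, and a `threading.Lock` stands in for the metadata lock.

```python
import threading


class CommitConflict(Exception):
    """Raised when a commit is based on a stale table version."""


class Catalog:
    """Holds the pointer to the current table version (toy model)."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for the metadata lock
        self.version = 0
        self.rows = ()

    def load(self):
        with self._lock:
            return self.version, self.rows

    def commit(self, base_version, new_rows):
        # Check and swap happen under one lock, so the replacement is atomic.
        with self._lock:
            if base_version != self.version:
                raise CommitConflict(
                    f"based on version {base_version}, current is {self.version}"
                )
            self.version += 1
            self.rows = new_rows


def append_with_retry(catalog, row, max_retries=4):
    """Redo the write on top of the latest version whenever a conflict occurs."""
    for attempt in range(max_retries + 1):
        base_version, rows = catalog.load()
        try:
            catalog.commit(base_version, rows + (row,))
            return attempt  # number of retries that were needed
        except CommitConflict:
            continue
    raise CommitConflict("gave up after repeated conflicts")


catalog = Catalog()
stale_version, stale_rows = catalog.load()   # writer A reads version 0
append_with_retry(catalog, "from-B")         # writer B commits first
try:
    catalog.commit(stale_version, stale_rows + ("from-A",))
    conflict_detected = False
except CommitConflict:
    conflict_detected = True                 # A's commit is rejected
append_with_retry(catalog, "from-A")         # A redoes its write and succeeds
print(conflict_detected, catalog.version, catalog.rows)
```

Writer A's first commit fails because it was prepared against version 0 while the table had already moved on; redoing the write against the latest version lets both rows land without either being lost.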

Reading

Reads and writes in Iceberg are independent of each other. Readers see only snapshots that have been successfully committed. A reader opens the metadata file of a version, obtains the snapshot information from it, and reads the table data as of that snapshot. Because the metadata file is replaced only after a write operation completes, reads always reflect completed operations and never in-progress writes.
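
The read path can be illustrated with immutable snapshots. In this minimal sketch (the `Table` class is hypothetical, and tuples stand in for committed snapshots), a reader pins the snapshot that was current when it started, and staged writer data becomes visible only after the commit publishes a new snapshot.

```python
class Table:
    """Toy table whose committed snapshots are immutable tuples."""

    def __init__(self):
        self.snapshots = [()]  # start with an empty committed snapshot

    def current(self):
        return self.snapshots[-1]

    def commit(self, staged_rows):
        # Publishing a new snapshot is the only step readers can observe.
        self.snapshots.append(self.current() + tuple(staged_rows))


table = Table()
table.commit(["r1", "r2"])

reader_view = table.current()        # reader pins the committed snapshot
staged = ["r3"]                      # writer stages data, not yet committed
seen_during_write = table.current()  # a second reader still sees r1, r2 only
table.commit(staged)                 # commit publishes the new snapshot

print(reader_view, seen_during_write, table.current())
```

The first reader keeps a consistent view even after the commit, because the snapshot it holds is never modified in place.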

Conflict Parameter Configuration

As write concurrency increases, DLC Native Table (Iceberg) may encounter write conflicts. To reduce how often conflicts occur, you can adjust your workloads in the following ways.
Design the table structure, such as its partitioning, so that the write scope of each job is well planned. This shortens task write time and, to some extent, lowers the probability of concurrent conflicts.
Consolidate jobs where practical to reduce the level of write concurrency.
DLC also supports a set of conflict retry parameters that increase the success rate of retries to some extent, thereby reducing the impact on business operations. The parameters and their configuration guidance are as follows.
| Attribute | Default value | Meaning | Configuration guide |
| --- | --- | --- | --- |
| commit.retry.num-retries | 4 | Number of retries after a commit failure | When retries occur, you can try increasing the number of attempts. |
| commit.retry.min-wait-ms | 100 | Minimum wait before a retry, in milliseconds | If conflicts are very frequent and persist even after waiting for a while, try increasing this value to lengthen the interval between retries. |
| commit.retry.max-wait-ms | 60000 (1 min) | Maximum wait before a retry, in milliseconds | Adjust this value together with commit.retry.min-wait-ms. |
| commit.retry.total-timeout-ms | 1800000 (30 min) | Total timeout for the entire commit retry process, in milliseconds | - |
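
To see how the four retry parameters interact, the sketch below computes a possible wait schedule. It assumes the wait doubles on each retry, starting at the minimum and capped at the maximum, and that retries stop once the accumulated wait would exceed the total timeout. The doubling factor and the `retry_waits` helper are assumptions made for illustration; the real retry logic may add jitter and also counts the time spent in each attempt.

```python
def retry_waits(num_retries=4, min_wait_ms=100, max_wait_ms=60_000,
                total_timeout_ms=1_800_000, scale=2.0):
    """Return the wait (in ms) before each retry under an assumed doubling backoff."""
    waits = []
    elapsed = 0
    for attempt in range(num_retries):
        # Grow from min_wait_ms by `scale` per attempt, capped at max_wait_ms.
        wait = min(int(min_wait_ms * scale ** attempt), max_wait_ms)
        if elapsed + wait > total_timeout_ms:
            break  # total timeout would be exceeded; stop retrying
        waits.append(wait)
        elapsed += wait
    return waits


print(retry_waits())                # schedule with the default values
print(retry_waits(num_retries=12))  # more retries: later waits hit the cap
print(retry_waits(num_retries=12, min_wait_ms=1_000))  # longer starting interval
```

Under these assumptions, raising `commit.retry.num-retries` alone adds retries with progressively longer waits, while raising `commit.retry.min-wait-ms` spreads all retries further apart, which is why the two wait settings are adjusted together.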

Hidden Partitioning

With hidden partitioning, DLC Native Table (Iceberg) hides partition details from users. Developers only need to specify the partition policy when creating the table, and Iceberg maintains the logical relationship between table fields and data files according to this policy. Neither writes nor queries need to be concerned with the partition layout: during a write, Iceberg derives the partition information from the policy and records it in the metadata; during a query, it uses the metadata to filter out files that do not need to be scanned. The partition policies provided by DLC Native Table (Iceberg) are shown in the table below.
| Transformation policy | Description | Types of original fields | Type after transformation |
| --- | --- | --- | --- |
| identity | No transformation | All types | Same as the original type |
| bucket[N, col] | Hash bucketing | int, long, decimal, date, time, timestamp, timestamptz, string, uuid, fixed, binary | int |
| truncate[L, col] | Fixed-length truncation | int, long, decimal, string | Same as the original type |
| year | Extract year information from fields | date, timestamp, timestamptz | int |
| month | Extract month information from fields | date, timestamp, timestamptz | int |
| day | Extract day information from fields | date, timestamp, timestamptz | int |
| hour | Extract hour information from fields | timestamp, timestamptz | int |
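
The following sketch shows what some of these transformations compute for a single value. It assumes the open-source Iceberg convention that the time-based transforms return the number of years, months, days, or hours since 1970-01-01, and that integer truncation rounds down to a multiple of the width. The function names are hypothetical, timestamps are treated as timezone-naive, and `bucket` is omitted because it depends on a specific hash function.

```python
from datetime import date, datetime, timedelta

EPOCH_DATE = date(1970, 1, 1)
EPOCH_TS = datetime(1970, 1, 1)


def year_transform(value):
    """Years since 1970 for a date or timestamp."""
    return value.year - 1970


def month_transform(value):
    """Months since 1970-01 for a date or timestamp."""
    return (value.year - 1970) * 12 + (value.month - 1)


def day_transform(value):
    """Days since 1970-01-01 for a date or timestamp."""
    as_date = value.date() if isinstance(value, datetime) else value
    return (as_date - EPOCH_DATE).days


def hour_transform(value):
    """Hours since 1970-01-01 00:00 for a timestamp."""
    return (value - EPOCH_TS) // timedelta(hours=1)


def truncate_transform(width, value):
    """Keep the first `width` characters of a string, or round an
    integer down to the nearest multiple of `width`."""
    if isinstance(value, str):
        return value[:width]
    return value - (value % width)


ts = datetime(2021, 6, 30, 13, 45)
print(year_transform(ts), month_transform(ts), day_transform(ts), hour_transform(ts))
print(truncate_transform(10, 27), truncate_transform(3, "iceberg"))
```

Because the partition value is derived from the column value, a filter on the original column (for example, a time range on a timestamp) can be translated into a range of partition values and used to skip files, without the query mentioning the partition at all.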

Metadata Queries and Stored Procedures

DLC Native Table (Iceberg) lets you query table metadata, such as history and snapshots, and call stored procedures for operations such as file merging and snapshot expiration. The table below lists some common statements. For specific syntax, see Iceberg table syntax.
| Scenario | Statement | Execution Engine |
| --- | --- | --- |
| Querying history | select * from DataLakeCatalog.db.sample$history | SuperSQL Spark (sql) engine and SuperSQL Presto engine |
| Querying history | select * from `DataLakeCatalog`.`db`.`sample`.`history` | SuperSQL Spark (job) engine and standard Spark engine |
| Querying snapshots | select * from DataLakeCatalog.db.sample$snapshots | SuperSQL Spark (sql) engine and SuperSQL Presto engine |
| Querying snapshots | select * from `DataLakeCatalog`.`db`.`sample`.`snapshots` | SuperSQL Spark (job) engine and standard Spark engine |
| Querying data files | select * from DataLakeCatalog.db.sample$files | SuperSQL Spark (sql) engine and SuperSQL Presto engine |
| Querying data files | select * from `DataLakeCatalog`.`db`.`sample`.`files` | SuperSQL Spark (job) engine and standard Spark engine |
| Querying manifests | select * from DataLakeCatalog.db.sample$manifests | SuperSQL Spark (sql) engine and SuperSQL Presto engine |
| Querying manifests | select * from `DataLakeCatalog`.`db`.`sample`.`manifests` | SuperSQL Spark (job) engine and standard Spark engine |
| Querying partitions | select * from DataLakeCatalog.db.sample$partitions | SuperSQL Spark (sql) engine and SuperSQL Presto engine |
| Querying partitions | select * from `DataLakeCatalog`.`db`.`sample`.`partitions` | SuperSQL Spark (job) engine and standard Spark engine |
| Rolling back to a specified snapshot | CALL DataLakeCatalog.system.rollback_to_snapshot('db.sample', 1) | SuperSQL Spark engine and standard Spark engine |
| Rolling back to a specific point in time | CALL DataLakeCatalog.system.rollback_to_timestamp('db.sample', TIMESTAMP '2021-06-30 00:00:00.000') | SuperSQL Spark engine and standard Spark engine |
| Setting the current snapshot | CALL DataLakeCatalog.system.set_current_snapshot('db.sample', 1) | SuperSQL Spark engine and standard Spark engine |
| Merging files | CALL DataLakeCatalog.system.rewrite_data_files(table => 'db.sample', strategy => 'sort', sort_order => 'id DESC NULLS LAST,name ASC NULLS FIRST') | SuperSQL Spark engine and standard Spark engine |
| Snapshot expiration | CALL DataLakeCatalog.system.expire_snapshots('db.sample', TIMESTAMP '2021-06-30 00:00:00.000', 100) | SuperSQL Spark engine and standard Spark engine |
| Removing orphan files | CALL DataLakeCatalog.system.remove_orphan_files(table => 'db.sample', dry_run => true) | SuperSQL Spark engine and standard Spark engine |
| Rewriting metadata | CALL DataLakeCatalog.system.rewrite_manifests('db.sample') | SuperSQL Spark engine and standard Spark engine |
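
The metadata queries above use two naming styles depending on the engine: a `$` suffix on the table name, or a fourth backtick-quoted name part. If you generate these statements from code, a small helper can keep the two styles straight. This is a sketch only: the `metadata_table_ref` function and the engine labels are hypothetical identifiers invented for this example, not DLC API values.

```python
# Hypothetical labels for the engine groups named in the table above.
DOLLAR_STYLE_ENGINES = {"supersql-spark-sql", "supersql-presto"}
BACKTICK_STYLE_ENGINES = {"supersql-spark-job", "standard-spark"}


def metadata_table_ref(catalog, database, table, metadata_table, engine):
    """Return the metadata table reference in the style the engine expects."""
    if engine in DOLLAR_STYLE_ENGINES:
        # e.g. DataLakeCatalog.db.sample$history
        return f"{catalog}.{database}.{table}${metadata_table}"
    if engine in BACKTICK_STYLE_ENGINES:
        # e.g. `DataLakeCatalog`.`db`.`sample`.`history`
        parts = (catalog, database, table, metadata_table)
        return ".".join(f"`{part}`" for part in parts)
    raise ValueError(f"unknown engine label: {engine}")


presto_query = "select * from " + metadata_table_ref(
    "DataLakeCatalog", "db", "sample", "history", "supersql-presto")
spark_job_query = "select * from " + metadata_table_ref(
    "DataLakeCatalog", "db", "sample", "snapshots", "standard-spark")

print(presto_query)
print(spark_job_query)
```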
