Paging by ID prevents rows from being missed when the source data changes during the loading process. The current big data platform does not support update operations, so updates are implemented as a full outer join followed by an insert overwrite: with daily scheduling, for example, the day's incremental data is full-outer-joined with the previous day's full data, and the resulting latest full data is written back over the table. If you are worried about update errors, keep a copy of each latest full version, retained only for a short period. (Alternatively, when the business system physically deletes rows from a table and the data warehouse needs to retain all historical data, you can choose this option to keep the latest full snapshot permanently in the data warehouse.)
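The two ideas above can be sketched in a few lines of Python. This is only an illustration of the logic, not the platform's actual implementation, which would normally be written in SQL: in-memory lists stand in for tables, and the names `page_by_id` and `merge_full_outer` and the `id` key column are assumptions made for this example.

```python
from typing import Dict, Iterator, List

Row = Dict[str, object]  # hypothetical row shape; "id" is an assumed key column


def page_by_id(rows: List[Row], page_size: int) -> Iterator[List[Row]]:
    """Page through rows by ID: each page holds the rows whose id is
    greater than the last id already read, in ascending id order."""
    last_id = None
    while True:
        # Re-read the source on every page, as a real query would.
        candidates = [r for r in rows if last_id is None or r["id"] > last_id]
        page = sorted(candidates, key=lambda r: r["id"])[:page_size]
        if not page:
            return
        yield page
        last_id = page[-1]["id"]


def merge_full_outer(prev_full: List[Row], increment: List[Row]) -> List[Row]:
    """Emulate full outer join + insert overwrite on the id key:
    take the incremental row when one exists, otherwise keep the
    previous day's row. The result replaces the full table."""
    prev_by_id = {r["id"]: r for r in prev_full}
    inc_by_id = {r["id"]: r for r in increment}
    all_ids = sorted(prev_by_id.keys() | inc_by_id.keys())
    return [inc_by_id.get(i, prev_by_id.get(i)) for i in all_ids]


if __name__ == "__main__":
    previous_day = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
    todays_increment = [{"id": 2, "v": "b2"}, {"id": 3, "v": "c"}]
    for row in merge_full_outer(previous_day, todays_increment):
        print(row)
```

In this sketch, a row that is missing from the increment (for example because it was physically deleted in the business system) is carried over from the previous day's full data, which is why the same pattern can preserve history. Because `page_by_id` filters on the last ID read rather than on a row offset, a row deleted from an earlier page does not shift later rows out of the next page.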