How to list Vertica Tables by number of rows ?

When working with Vertica, you may need to find out which tables have the largest number of rows. To do so, you must tap into the storage_containers system table, that will give metrics about the number of rows, AND the number of deleted rows. But since containers are distributed, you will need to aggregate this data, and ensure to limit your query to the super-projection, which guarantee to contain all table info. In brief :

with num_rows as (
    select schema_name,
           anchor_table_name as table_name,
           sum(total_row_count - deleted_row_count) as rows
    from v_monitor.storage_containers sc
    join v_catalog.projections p
         on sc.projection_id = p.projection_id
         and p.is_super_projection = true
    group by schema_name,
             table_name,
             sc.projection_id
)
select schema_name,
       table_name,
       max(rows) as rows
from num_rows
group by schema_name,
         table_name
order by rows desc;

Leave a Reply

Your email address will not be published. Required fields are marked *