+ Reply to Thread
Results 1 to 3 of 3

DB2 High Performance Unload + remove unwanted data

  1. DB2 High Performance Unload + remove unwanted data

    Hello,

    Linux RedHat AS4 running DB2 V8 FP15.

    I have very large tables (100 Million+ rows), and I only need to keep
    6 months of data online/available. The tables have a timestamp field.

    I was hoping that DB2 High Performance Unload V3 allow me to 'extract
    and delete' data from tablespaces very quickly, but unfortunately it
    is only able to extract/export data (much faster than regular 'export'
    statements).

    Silly question: How do I efficiently remove unwanted data from very
    large tables, without having to take my applications down ? DELETE
    statements are fairly slow/unefficient with large tables.

    PS: High Performance Unload is a great tool. You can extract tables
    (full or parts of it) from a backup image. It is a must have for all
    DBAs.

    Thanks


  2. Re: DB2 High Performance Unload + remove unwanted data

    Michel Esber wrote:
    > Hello,
    >
    > Linux RedHat AS4 running DB2 V8 FP15.
    >
    > I have very large tables (100 Million+ rows), and I only need to keep
    > 6 months of data online/available. The tables have a timestamp field.
    >
    > I was hoping that DB2 High Performance Unload V3 allow me to 'extract
    > and delete' data from tablespaces very quickly, but unfortunately it
    > is only able to extract/export data (much faster than regular 'export'
    > statements).
    >
    > Silly question: How do I efficiently remove unwanted data from very
    > large tables, without having to take my applications down ? DELETE
    > statements are fairly slow/unefficient with large tables.

    Ideally you set up your table design to encourage the rollout (MDC
    (block deletes), range partitioning (DETACH), UNION ALL view (cycle
    branches).
    Then you can do some smarter ways of doing mass deletes. Som eof which
    have been discussed here or you can find them in my "SQL on Fire"
    presentation (google)

    Cheers
    Serge


    --
    Serge Rielau
    DB2 Solutions Development
    IBM Toronto Lab

  3. Re: DB2 High Performance Unload + remove unwanted data

    "Michel Esber" wrote in message
    news:1194882454.212731.171150@v29g2000prd.googlegroups.com...
    > Hello,
    >
    > Linux RedHat AS4 running DB2 V8 FP15.
    >
    > I have very large tables (100 Million+ rows), and I only need to keep
    > 6 months of data online/available. The tables have a timestamp field.
    >
    > I was hoping that DB2 High Performance Unload V3 allow me to 'extract
    > and delete' data from tablespaces very quickly, but unfortunately it
    > is only able to extract/export data (much faster than regular 'export'
    > statements).
    >
    > Silly question: How do I efficiently remove unwanted data from very
    > large tables, without having to take my applications down ? DELETE
    > statements are fairly slow/unefficient with large tables.
    >
    > PS: High Performance Unload is a great tool. You can extract tables
    > (full or parts of it) from a backup image. It is a must have for all
    > DBAs.
    >
    > Thanks


    If you cannot bring your application down at all, then I would do the
    following:

    Write a stored procedure that processes the rows to be deleted in a cursor
    WITH UR and WITH HOLD (so as not to close the cursor when a commit happens).
    Do not put an order by in the cursor.

    Then as you fetch each row in the cursor and a row is to be deleted, then do
    a separate delete statement for that row. Do a commit every 100 deleted rows
    (this might vary depending on your concurrency requirements) or every x
    minutes (whichever comes first).

    This is not the fastest way to delete rows, but will let your application
    access the table the entire time you are deleting rows.

    Then you will want to do an online reorg, which you can stop and start
    multiple times if you cannot do it all at once (for example, if you only
    wanted to run the online reorg during late hours of the night when activity
    on your system is slow).

    Setting the DB2_SKIPDELETED=YES environment variable (with db2set) will
    also improve concurrency while you are doing deletes.



+ Reply to Thread