+ Reply to Thread
Results 1 to 8 of 8

Best way to delete million records

  1. Best way to delete million records

    Hi all,

    I have to delete about 43 millions rows from a table, but the following
    conditions exist:

    - The table is not partitioned and unfortunately I can't modify the
    structure
    - The delete procedure must be done without stopping service, so I can't
    use CTAS, moving records and renaming table.
    - I can't use truncate because the records affected are selected in a
    range of dates

    Waiting your suggestions and probable problems for this procedure.

    Thanks for help.

    Bye

  2. Re: Best way to delete million records

    On Jun 30, 10:00*am, fefe78 wrote:
    > I have to delete about 43 millions rows from a table, but the following
    > conditions exist:
    >
    > - The table is not partitioned and unfortunately I can't modify the
    > structure
    > - The delete procedure must be done without stopping service, so I can't
    > use CTAS, moving records and renaming table.
    > - I can't use truncate because the records affected are selected in a
    > range of dates
    >
    > Waiting your suggestions and probable problems for this procedure.


    If you do not have proper indexes in place not matter what you need to
    at least once run through the whole table. If this is a task you face
    regularly I'd probably bite the bullet and create appropriate indexes
    or - even better - partitions. If this is just done on few occasions
    it might be better to just issue the DELETE and wait. Watch out for
    undo / redo sizes though.

    Kind regards

    robert

  3. Re: Best way to delete million records

    The table is indexed, and this is a one time procedure.
    After this will be scheduled a job to daily delete old records.
    In some forums I red that drop indexes and ricreate them after deletion,
    could be a good solution.

    Robert Klemme ha scritto:
    > On Jun 30, 10:00 am, fefe78 wrote:
    >> I have to delete about 43 millions rows from a table, but the following
    >> conditions exist:
    >>
    >> - The table is not partitioned and unfortunately I can't modify the
    >> structure
    >> - The delete procedure must be done without stopping service, so I can't
    >> use CTAS, moving records and renaming table.
    >> - I can't use truncate because the records affected are selected in a
    >> range of dates
    >>
    >> Waiting your suggestions and probable problems for this procedure.

    >
    > If you do not have proper indexes in place not matter what you need to
    > at least once run through the whole table. If this is a task you face
    > regularly I'd probably bite the bullet and create appropriate indexes
    > or - even better - partitions. If this is just done on few occasions
    > it might be better to just issue the DELETE and wait. Watch out for
    > undo / redo sizes though.
    >
    > Kind regards
    >
    > robert


  4. Re: Best way to delete million records

    fefe78 wrote:
    > The table is indexed, and this is a one time procedure.


    What is it indexed on?

    > After this will be scheduled a job to daily delete old records.


    Given this, I doubt that all 43 million rows need to be deleted in
    a single atomic transaction. If that is the case, breaking it into smaller
    transactions could prevent causing havoc with the undo.


    > In some forums I red that drop indexes and ricreate them after deletion,
    > could be a good solution.


    If the indexes are important, then dropping them will effectively "stop
    service", which you say is not allowed.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    The costs of publication of this article were defrayed in part by the
    payment of page charges. This article must therefore be hereby marked
    advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
    this fact.

  5. Re: Best way to delete million records

    On Jun 30, 1:00*am, fefe78 wrote:
    > Hi all,
    >
    > I have to delete about 43 millions rows from a table, but the following
    > conditions exist:
    >
    > - The table is not partitioned and unfortunately I can't modify the
    > structure
    > - The delete procedure must be done without stopping service, so I can't
    > use CTAS, moving records and renaming table.
    > - I can't use truncate because the records affected are selected in a
    > range of dates
    >
    > Waiting your suggestions and probable problems for this procedure.
    >
    > Thanks for help.
    >
    > Bye


    Xho's answer is the one that immediately comes to mind, how long do
    you have to do this, and how many rows are you not deleting?

    Sometimes when you calculate all the costs and show all the possible
    ways to do it, 7/24 uptime becomes a nice-to-have rather than the
    absolute directive.

    Another possibility may be to split table in two, replacing it with a
    view, and slowly move the rows to the archive table and truncate.
    This may have some bad performance implications, depending on how the
    table is accessed.

    You may very well need to rebuild/coalesce indices or move stuff
    around in the table, if you are deleting a large proportion of the
    rows.

    Your options are version-dependent.

    jg
    --
    @home.com is bogus.
    "If the wings are traveling faster than the fuselage, it's probably a
    helicopter - and therefore, unsafe."

  6. Re: Best way to delete million records

    On Mon, 30 Jun 2008 10:00:51 +0200, fefe78 wrote:

    > Hi all,
    >
    > I have to delete about 43 millions rows from a table, but the following
    > conditions exist:
    >
    > - The table is not partitioned and unfortunately I can't modify the
    > structure
    > - The delete procedure must be done without stopping service, so I can't
    > use CTAS, moving records and renaming table. - I can't use truncate
    > because the records affected are selected in a range of dates
    >
    > Waiting your suggestions and probable problems for this procedure.
    >
    > Thanks for help.
    >
    > Bye


    You can also write a PL/SQL script that would delete million by
    million records and commit it every time. That will prevent you
    from running out of the UNDO space but will probably run for a week,
    in the background. To delete 43 million records is a major intervention
    and if you want to do it in a non-invasive fashion, background job running
    a PL/SQL delete script is, in my opinion, the best way. Of course, you
    should also be checking space on your archive log destination. Something
    like this is more then likely to create a major log switching storm.
    Expect a checkpoint every few seconds, with an occasional "checkpoint not
    complete" message. Users are more then likely to notice that kind of
    activity. It it is possible, you can launch several delete scripts, one
    deleting Mondays, one Tuesdays and so on. If your machine capacity is
    good enough, the machine will survive. To survive a thing like that, you
    should have at least 6 CPUs.
    Also, if there is a standby, it needs to be carefully monitored, too. This
    might flood your 1GB LAN card.



    --
    http://mgogala.freehostia.com

  7. Re: Best way to delete million records

    On 30 jun, 05:00, fefe78 wrote:
    > Hi all,
    >
    > I have to delete about 43 millions rows from a table, but the following
    > conditions exist:
    >
    > - The table is not partitioned and unfortunately I can't modify the
    > structure
    > - The delete procedure must be done without stopping service, so I can't
    > use CTAS, moving records and renaming table.
    > - I can't use truncate because the records affected are selected in a
    > range of dates
    >
    > Waiting your suggestions and probable problems for this procedure.
    >
    > Thanks for help.
    >
    > Bye


    Hello,

    i have a similar problem and use someting like this:
    -----------------------------
    DECLARE

    -- 1 month
    ITERACIONES CONSTANT NUMBER := 30;

    -- 2 years. your oldest record timestamp
    OLDEST_DATE_ CONSTANT NUMBER := 730;
    oldest_date DATE;

    KEEP_DATE CONSTANT NUMBER := 60;

    STEP_DATE CONSTANT NUMBER := 1;
    loop_date DATE;

    BEGIN
    dbms_output.put_line('.- Iniciando a las: ' || SYSDATE );
    -- Inicializamos
    -- set server outout on format wrapped;
    SELECT SYSDATE - OLDEST_DATE_ INTO oldest_date FROM DUAL;
    SELECT oldest_date INTO loop_date FROM DUAL;

    FOR x IN 1..ITERACIONES LOOP
    dbms_output.put_line('Iniciando procedimiento nro: ' || x);
    dbms_output.put_line('borrando registros entre Fechas ' ||
    oldest_date || ' y ' || loop_date );

    -- DELETE...;
    DELETE FROM iwaylogs.cm_leases
    WHERE iwaylogs.cm_leases.fecha BETWEEN oldest_date AND
    loop_date;
    -- COMMIT;
    COMMIT;

    -- adelanto la fecha
    oldest_date := loop_date;
    -- adelanto el loop
    loop_date := loop_date + STEP_DATE;

    -- Si ya llegue a borrar hasta donde quiero salgo
    IF ( loop_date >= (SYSDATE - KEEP_DATE) ) THEN
    dbms_output.put_line('Llegamos a los 60 dias,
    saliendo');
    EXIT;
    END IF;
    dbms_output.put_line('Procedimiento listo.');
    -- Esperamos XX segundos para 'descansar' la DB :)
    dbms_lock.sleep(90);
    END LOOP;

    dbms_output.put_line('Finalizado' );

    END;

    ------------------------------------------------------
    )

  8. Re: Best way to delete million records


    schreef in bericht
    news:cf74d9a0-64b6-4d92-a415-d6823fcb527b@l42g2000hsc.googlegroups.com...
    > On 30 jun, 05:00, fefe78 wrote:
    >> Hi all,
    >>
    >> I have to delete about 43 millions rows from a table, but the following
    >> conditions exist:
    >>
    >> - The table is not partitioned and unfortunately I can't modify the
    >> structure
    >> - The delete procedure must be done without stopping service, so I can't
    >> use CTAS, moving records and renaming table.
    >> - I can't use truncate because the records affected are selected in a
    >> range of dates
    >>
    >> Waiting your suggestions and probable problems for this procedure.
    >>
    >> Thanks for help.
    >>
    >> Bye

    >
    > Hello,
    >
    > i have a similar problem and use someting like this:
    > -----------------------------
    > DECLARE
    >
    > -- 1 month
    > ITERACIONES CONSTANT NUMBER := 30;
    >
    > -- 2 years. your oldest record timestamp
    > OLDEST_DATE_ CONSTANT NUMBER := 730;
    > oldest_date DATE;
    >
    > KEEP_DATE CONSTANT NUMBER := 60;
    >
    > STEP_DATE CONSTANT NUMBER := 1;
    > loop_date DATE;
    >
    > BEGIN
    > dbms_output.put_line('.- Iniciando a las: ' || SYSDATE );
    > -- Inicializamos
    > -- set server outout on format wrapped;
    > SELECT SYSDATE - OLDEST_DATE_ INTO oldest_date FROM DUAL;
    > SELECT oldest_date INTO loop_date FROM DUAL;
    >
    > FOR x IN 1..ITERACIONES LOOP
    > dbms_output.put_line('Iniciando procedimiento nro: ' || x);
    > dbms_output.put_line('borrando registros entre Fechas ' ||
    > oldest_date || ' y ' || loop_date );
    >
    > -- DELETE...;
    > DELETE FROM iwaylogs.cm_leases
    > WHERE iwaylogs.cm_leases.fecha BETWEEN oldest_date AND
    > loop_date;
    > -- COMMIT;
    > COMMIT;
    >
    > -- adelanto la fecha
    > oldest_date := loop_date;
    > -- adelanto el loop
    > loop_date := loop_date + STEP_DATE;
    >
    > -- Si ya llegue a borrar hasta donde quiero salgo
    > IF ( loop_date >= (SYSDATE - KEEP_DATE) ) THEN
    > dbms_output.put_line('Llegamos a los 60 dias,
    > saliendo');
    > EXIT;
    > END IF;
    > dbms_output.put_line('Procedimiento listo.');
    > -- Esperamos XX segundos para 'descansar' la DB :)
    > dbms_lock.sleep(90);
    > END LOOP;
    >
    > dbms_output.put_line('Finalizado' );
    >
    > END;
    >
    > ------------------------------------------------------
    > )


    Why don't you just do a delete like
    where fecha between oldest_date and oldest_date+30 ? It saves you a loop and
    you do it all in one transaction.... and you don't have to 'descansar' -
    sleep - your DB.
    (I noted you'll have to keep track of KEEP_DATE as well, so you'll have to
    do a MIN( ) in your comparison somewhere)
    Furthermore, your loop sleeps 90 seconds every loop, which makes your
    program run at least in 30*90 seconds = 2700 seconds is 45 minutes... even
    if there is nothing to delete in the first 29 loops!!! Which will be the
    case if you run your script once a day. If you run it several times a day,
    it is even doing nothing at all for 45 minutes.
    (delete nothing, wait 90 seconds, delete nothing, wait 90 seconds and so
    on....)

    Shakespeare




+ Reply to Thread