
Dynamic Unidata File

  1. Dynamic Unidata File

    I have a large dynamic file and the performance is poor, below is the
    file.stat.
    I know enough about UniData to handle static files, but no experience
    with dynamic files...

    Can one of you experts take a quick look and see if you have any
    recommendations?

    File name(Dynamic File) = MY.FILE
    Number of groups in file (modulo) = 935054
    Dynamic hashing, hash type = 0
    Split/Merge type = KEYONLY
    Block size = 1024
    File has 935054 groups in level one overflow.
    Number of records = 25886073
    Total number of bytes = 4724285816

    Average number of records per group = 27.7
    Standard deviation from average = 0.7
    Average number of bytes per group = 5052.4
    Standard deviation from average = 379.3

    Average number of bytes in a record = 182.5
    Average number of bytes in record ID = 8.6
    Standard deviation from average = 69.1
    Minimum number of bytes in a record = 68
    Maximum number of bytes in a record = 32537

    Minimum number of fields in a record = 11
    Maximum number of fields in a record = 51
    Average number of fields per record = 41.1
    Standard deviation from average = 7.7
    File has 5 over files, 1 prime files


    Thanks,
    -Larry

  2. Re: Dynamic Unidata File

    Oh... I'm running Unidata 6.0 on HP-UX 11.11.

  3. Re: Dynamic Unidata File

    Hi Larry,

    Something looks very wrong with this file. Can you please run
    ANALYZE.FILE and post the results (or just the start of the report).

    It looks as though the file is not splitting for some reason. I have
    seen this a few times with Unidata and never managed to resolve what
    is happening. One possibility is that your application does SELECT
    operations against the file that don't run to completion. The split/
    merge process is disabled while a select is in progress to ensure that
    the select list is correct rather than omitting records moved by a
    merge or showing records moved by a split twice. A hint about whether
    this might be the problem can be obtained by running
    sms -d
    from the operating system command prompt. If I remember rightly, the
    column headed "flag" is the count of selects in progress. Splits and
    merges are suspended while this is non-zero.
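
    If it helps, a rough way to keep an eye on this from the shell is
    something like the loop below (just a sketch; it assumes $UDTBIN is on
    your PATH and that you run it from the account in question, and the
    exact column layout varies by release):

    while true
    do
        sms -d     # watch the select ("flag") count; splits/merges resume once it is zero
        sleep 10
    done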

    I have just tried to verify all of this on a Windows system and
    discovered two new problems. I created a dynamic file named DH and
    populated it by copying the VOC into the file. I then started a select
    operation and ran the sms command. This crashed with a "Unidata has
    encountered a problem - send report to Microsoft?" window. Great!

    I then discovered something very odd. If I copy the VOC to a
    previously empty dynamic file, it grows to modulo 32. I then delete
    all the records. The modulo should fall to the value I chose as the
    minimum (3) but it stays at 32. If I then copy the VOC records in
    again, the modulo becomes 33. Every time I go round this cycle of
    deleting the records and reinserting them, the modulo goes up by one.
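
    For reference, the cycle is roughly the following ECL sequence (a
    sketch only; DH and the minimum modulo of 3 are simply the values from
    my test, and the deletion step assumes DELETE works from the active
    select list on your release):

    CREATE.FILE DH 3 DYNAMIC       create an empty dynamic file, minimum modulo 3
    COPY FROM VOC TO DH ALL        populate it; the modulo grows to 32
    FILE.STAT DH                   confirm the modulo
    SELECT DH
    DELETE DH                      delete every record via the active select list
    FILE.STAT DH                   modulo stays at 32 instead of merging back to 3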

    Something is not right with Unidata dynamic files.


    Martin Phillips, Ladybridge Systems.



  4. Re: Dynamic Unidata File

    On Jan 12, 8:13 am, Martin Phillips wrote:
    >
    > Something looks very wrong with this file. Can you please run
    > ANALYZE.FILE and post the results (or just the start of the report).
    >

    Here is the start of the report:

    Dynamic File name = MY.FILE
    Number of groups in file (modulo) = 935054
    Minimum groups of file = 467527
    Hash type = 0, blocksize = 1024
    Split load = 60, Merge load = 40
    Split/Merge type = KEYONLY

    Group   Keys   Key Loads   Percent
    ===================================
        0     27         448        43
        1     27         448        43
        2     27         448        43
        3     27         448        43
        4     27         448        43
        5     27         448        43
        6     27         448        43
        7     27         448        43
        8     27         448        43
        9     27         448        43
       10     27         448        43

  5. Re: Dynamic Unidata File

    Larry,

    The reason it is not splitting is that the split load is set at 60 when
    it should be 20 or less. Your overflow looks like it is about four
    times the size of the group. My off-the-cuff recommendation is that you
    do the following:

    CONFIGURE.FILE MY.FILE SPLIT.LOAD 20 MERGE.LOAD 5
    !memresize MY.FILE 583523,8 MEMORY 512000

    If you have a gigabyte of memory you can increase the 512000 to
    1024000. Let me know how it turns out.

    Regards,
    Doug
    www.u2logic.com


  6. Re: Dynamic Unidata File

    On Jan 12, 4:59 pm, "dave...@gmail.com" wrote:
    > My off the cuff recommendation is that you do the following:
    >
    > CONFIGURE.FILE MY.FILE SPLIT.LOAD 20 MERGE.LOAD 5
    > !memresize MY.FILE 583523,8 MEMORY 512000
    >


    I tried this (after a couple of mods to make it work for Unix) and got
    a pretty hairy error message.
    Basically it looks like it tried to create a file over 2 GB and crapped
    out.

    Any other ideas?

  7. Re: Dynamic Unidata File

    On Jan 12, 11:35 am, laurence.sto...@cardinal.com wrote:
    > On Jan 12, 8:13 am, Martin Phillips wrote:
    >
    > > Something looks very wrong with this file. Can you please run
    > > ANALYZE.FILE and post the results (or just the start of the report).

    >
    > Here is the start of the report:
    >
    > Dynamic File name = MY.FILE
    > Number of groups in file (modulo) = 935054
    > Minimum groups of file = 467527
    > Hash type = 0, blocksize = 1024
    > Split load = 60, Merge load = 40
    > Split/Merge type = KEYONLY
    >
    > Group   Keys   Key Loads   Percent
    > ===================================
    >     0     27         448        43
    >     1     27         448        43
    >     2     27         448        43
    >     3     27         448        43
    >     4     27         448        43
    >     5     27         448        43
    >     6     27         448        43
    >     7     27         448        43
    >     8     27         448        43
    >     9     27         448        43
    >    10     27         448        43


    It's been a while since I've dealt with Unidata file sizing, but here
    is how I would proceed:

    Note that the Split/Merge type is KEYONLY, and the Split Load is set
    at 60. The group detail listing shows that your key loads are only
    43%, so none of those groups will split yet. Every group is in level
    one overflow, which means that the data won't all fit into the block
    size you've set (1024).

    I would set the Split/Merge type to KEYDATA, so that the size of the
    *data* has an impact on the splitting behavior. Unidata recommends
    10-20 records per group as a rule of thumb, so I'll use 15 for my
    math. That means you need (183 * 15 ) = 2745 bytes per group, which I
    would then round up to the next power of 2, or 4096.

    Unidata sets aside a percentage of the first block in each group for
    key storage (20%, if I remember correctly). So, to avoid level one
    (data) overflow, we're at (4096 * 0.8 ) = 3276 bytes per group for
    data storage. You'll get ( 3276 / 183 ) = 17.9 records per group.
    Let's call that 18. Since you have 25,886,073 records, you'll need
    (25,886,073 / 18 ) = 1,438,115 groups.
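
    If you want to redo that arithmetic for another file, here is a
    throwaway UniBasic sketch of the same calculation (the 4096 block size
    and the 20% key reserve are the assumptions above, not fixed
    constants):

    * Sizing sketch: derive a target modulo from the FILE.STAT figures
    AVG.REC.BYTES = 183        ;* average record size from FILE.STAT
    NUM.RECORDS = 25886073     ;* record count from FILE.STAT
    BLOCK.SIZE = 4096          ;* chosen block size
    DATA.SPACE = BLOCK.SIZE * 0.8                             ;* roughly 20% reserved for keys
    RECS.PER.GROUP = INT(DATA.SPACE / AVG.REC.BYTES + 0.5)    ;* rounds to 18 here
    TARGET.MODULO = INT(NUM.RECORDS / RECS.PER.GROUP)         ;* 1,438,115 for this file
    PRINT "Records per group: ":RECS.PER.GROUP
    PRINT "Target modulo: ":TARGET.MODULO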

    So, create a file like this:

    :CREATE.FILE NEWFILE 1438115,4 DYNAMIC KEYDATA

    Unidata will figure out if that's a prime number, and round up for you
    if necessary. Finally, copy all of the records from your old file into
    the new file, then delete the old file and rename the new one
    accordingly. Don't forget to copy the DICT records first!
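
    In ECL the copy step is roughly the following (a sketch only; adjust
    the names to suit, and I don't recall a single Unidata command for the
    final rename):

    COPY FROM DICT MY.FILE TO DICT NEWFILE ALL OVERWRITING
    COPY FROM MY.FILE TO NEWFILE ALL
    DELETE.FILE MY.FILE

    Then point your VOC entry (or application) at NEWFILE, or rename the
    new file at the operating system level and fix up the VOC entry.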

    Hope this helps.

    Jeff

  8. Re: Dynamic Unidata File

    > So, create a file like this:
    >
    > :CREATE.FILE NEWFILE 1438115,4 DYNAMIC KEYDATA
    >
    > Hope this helps.
    >


    Hey Jeff,

    Thanks for the advice.

    I did it a little differently: I resized the file using your
    calculations:
    !memresize MY.FILE 1438115,4 DYNAMIC KEYDATA
    and the speed of file access is vastly improved!!
    A program that used to take 24 hours now runs in 3!!


    Thanks again,
    -Larry


  9. Re: Dynamic Unidata File

    On Jan 18, 4:52 pm, laurence.sto...@cardinal.com wrote:
    >
    > I did it a little differently: I resized the file using your
    > calculations:
    > !memresize MY.FILE 1438115,4 DYNAMIC KEYDATA
    > and the speed of file access is vastly improved!!
    > A program that used to take 24 hours now runs in 3!!
    >
    > Thanks again,
    > -Larry


    Glad to hear about the speed improvement! I suggested the copy-then-
    rename strategy because of the issue reported earlier with memresize.

    It would be interesting to see what the guide utility says about
    predicted optimal file size. I haven't found it to be very good with
    dynamic files.

    Jeff
