-
creating mmap-backed DbEnv with MAP_NOSYNC
Hello,
I'm in the process of converting an app to BerkeleyDb (from MySQL, but
that's not really relevant to the issue at hand) and am trying to get it
to run nicely with an mmap-based environment. I'm using FreeBSD 5.2.1-R,
and the machines in use are currently an Athlon 3000+ with 1GB RAM and a
Pentium 4 2.8GHz with 2GB RAM. I could use the DB_SYSTEM_MEM option on my
system, but I'm limited by a rather low OS default for the maximum IPC
shared memory size (32MB), and I'd like to avoid having to tune the
production machine(s) for my one app.
When I first ran the program, which generates large amounts of data
(reads a 100MB database, produces 15GB), I noticed that it would
regularly pause and write to disk a lot, but db_stat -m's count of
flushed pages wouldn't increase. 'top' indicated the program's state was
'vmpfw'. This behavior did not appear when I used DB_SYSTEM_MEM with
the same cache size. I finally figured out that it was the result of the
OS syncing the mmap'ed memory back to disk.
FreeBSD has a flag to mmap(2) to disable this behavior, but a quick grep
of the BerkeleyDb sources (4.2.52) doesn't indicate that it's used. Are
there any reasons you can think of why BerkeleyDb shouldn't already use
this option if it's available? Do you foresee any issues with me patching
my BerkeleyDb install with this flag?
FreeBSD's mmap manpage has a description of MAP_NOSYNC. I've quoted the
first paragraph below.
http://www.freebsd.org/cgi/man.cgi?q...ts&format=html
MAP_NOSYNC
Causes data dirtied via this VM map to be flushed to
physical media only when necessary (usually by the
pager) rather than gratuitously. Typically this prevents
the update daemons from flushing pages dirtied
through such maps and thus allows efficient sharing of
memory across unassociated processes using a file-backed
shared memory map. Without this option any VM
pages you dirty may be flushed to disk every so often
(every 30-60 seconds usually) which can create performance
problems if you do not need that to occur (such
as when you are using shared file-backed mmap regions
for IPC purposes). Note that VM/file system coherency
is maintained whether you use MAP_NOSYNC or not. This
option is not portable across UNIX platforms (yet),
though some may implement the same behavior by default.
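To make the question concrete, here is a minimal sketch of what requesting the flag looks like in a standalone program. It is not Berkeley DB code: the helper names region_map_flags and map_region_file are hypothetical, and the only assumption about the platform is that MAP_NOSYNC is a preprocessor macro where it exists, so it can be tested with #ifdef.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Flags for a shared, file-backed mapping. MAP_NOSYNC is added only on
 * systems whose <sys/mman.h> defines it (FreeBSD does); elsewhere this
 * is plain MAP_SHARED.
 */
static int region_map_flags(void)
{
    int flags = MAP_SHARED;
#ifdef MAP_NOSYNC
    flags |= MAP_NOSYNC;
#endif
    return flags;
}

/*
 * Hypothetical helper: create (or open) `path`, size it to `len` bytes,
 * and map it read/write. Returns NULL on failure.
 */
static void *map_region_file(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1)
        return NULL;

    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return NULL;
    }

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      region_map_flags(), fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */

    return addr == MAP_FAILED ? NULL : addr;
}
```

On a system without MAP_NOSYNC the same code compiles and simply falls back to an ordinary shared mapping, which is why the #ifdef guard seems like the natural way to add the flag.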
-
Re: creating mmap-backed DbEnv with MAP_NOSYNC
Ludwig Pummer wrote in message news:...
> FreeBSD has a flag to mmap(2) to disable this behavior, but a quick grep
> of the BerkeleyDb sources (4.2.52) doesn't indicate that it's used. Are
> there any reasons you can think of why BerkeleyDb shouldn't already use
> this option if it's available? Do you foresee any issues with me patching
> my BerkeleyDb install with this flag?
Berkeley DB doesn't specify the MAP_NOSYNC flag because we
weren't aware of it, as far as I know. Thanks for pointing
this out. I've made the appended change, and we'll begin
testing with it soon.
Regards,
--keith
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Keith Bostic bostic@sleepycat.com
Sleepycat Software Inc. keithbosticim (ymsgid)
118 Tower Rd. +1-781-259-3139
Lincoln, MA 01773 http://www.sleepycat.com
*** os/os_map.c.orig Tue Jul 1 15:47:15 2003
--- os/os_map.c Tue Apr 6 07:00:34 2004
***************
*** 379,384 ****
--- 379,397 ----
      COMPQUIET(is_region, 0);
  #endif
+     /*
+      * FreeBSD:
+      * Causes data dirtied via this VM map to be flushed to physical media
+      * only when necessary (usually by the pager) rather than gratuitously.
+      * Typically this prevents the update daemons from flushing pages
+      * dirtied through such maps and thus allows efficient sharing of
+      * memory across unassociated processes using a file-backed shared
+      * memory map.
+      */
+ #ifdef MAP_NOSYNC
+     flags |= MAP_NOSYNC;
+ #endif
+
      prot = PROT_READ | (is_rdonly ? 0 : PROT_WRITE);
      /*