use CDB_File;
tie %h, 'CDB_File', 'file.cdb' or die "tie failed: $!\n";
$t = new CDB_File ('t.cdb', 't.tmp') or die ...;
$t->insert('key', 'value');
$t->finish;
CDB_File::create %t, $file, "$file.$$";
or
use CDB_File 'create';
create %t, $file, "$file.$$";
cdb is a fast, reliable, lightweight package for creating and
reading constant databases.
After the tie shown above, accesses to %h will refer to the cdb file file.cdb, as described under tie in perlfunc.
A cdb file is created in three steps. First, call new CDB_File ($final, $tmp), where $final is the name of the database to be created, and $tmp is the name of a temporary file which can be atomically renamed to $final. Second, call the insert method once for each (key, value) pair. Finally, call the finish method to complete the creation and renaming of the cdb file.
A simpler interface to cdb file creation is provided by CDB_File::create %t, $final, $tmp. This creates a cdb file named $final containing the contents of %t. As before, $tmp must name a temporary file which can be atomically renamed to $final. CDB_File::create may be imported.
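As a minimal sketch of the simpler interface, the following writes a small in-memory hash with CDB_File::create and reads it back through tie. The file name and hash contents are hypothetical, and File::Temp is used only to give the example a scratch directory; it assumes CDB_File is installed.

```perl
use strict;
use warnings;
use CDB_File;
use File::Temp qw(tempdir);

# Scratch directory, removed when the program exits.
my $dir   = tempdir(CLEANUP => 1);
my $final = "$dir/colours.cdb";    # hypothetical database name
my $tmp   = "$final.$$";           # temporary file, renamed to $final on success

my %colours = (red => 'rojo', green => 'verde', blue => 'azul');

# Write the whole hash in one call.
CDB_File::create(%colours, $final, $tmp)
    or die "$0: can't create $final: $!\n";

# Read the database back through the tied-hash interface.
tie my %h, 'CDB_File', $final
    or die "$0: can't tie to $final: $!\n";
my %copy = %h;                     # plain copy, usable after untie
untie %h;

print "$_ => $copy{$_}\n" for sort keys %copy;
```

Copying the tied hash into %copy before untie keeps the data available once the file has been released.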
1. Convert a Berkeley DB (B-tree) database to cdb format.
use CDB_File;
use DB_File;
tie %h, 'DB_File', $ARGV[0], O_RDONLY, undef, $DB_BTREE or
die "$0: can't tie to $ARGV[0]: $!\n";
CDB_File::create %h, $ARGV[1], "$ARGV[1].$$" or
die "$0: can't create cdb: $!\n";
2. Convert a flat file to cdb format. In this example, the flat file consists of one key per line, separated by a colon from the value. Blank lines and lines beginning with # are skipped.
use CDB_File;
$cdb = new CDB_File("data.cdb", "data.$$") or
die "$0: new CDB_File failed: $!\n";
while (<>) {
next if /^$/ or /^#/;
chomp;
($k, $v) = split /:/, $_, 2;
if (defined $v) {
$cdb->insert($k, $v);
} else {
warn "bogus line: $_\n";
}
}
$cdb->finish or die "$0: CDB_File finish failed: $!\n";
3. Perl version of cdbdump.
use CDB_File;
tie %data, 'CDB_File', $ARGV[0] or
die "$0: can't tie to $ARGV[0]: $!\n";
while (($k, $v) = each %data) {
print '+', length $k, ',', length $v, ":$k->$v\n";
}
print "\n";
4. Although a cdb file is constant, you can simulate updating it in Perl. This is an
expensive operation, as you have to create a new database, and copy into it
everything that's unchanged from the old database. (As compensation, the
update does not affect database readers. The old database is available for
them, till the moment the new one is finished.)
use CDB_File;
$file = 'data.cdb';
$new = new CDB_File($file, "$file.$$") or
die "$0: new CDB_File failed: $!\n";
# Add the new values; remember which keys we've seen.
while (<>) {
chomp;
($k, $v) = split;
$new->insert($k, $v);
$seen{$k} = 1;
}
# Add any old values that haven't been replaced.
tie %old, 'CDB_File', $file or die "$0: can't tie to $file: $!\n";
while (($k, $v) = each %old) {
$new->insert($k, $v) unless $seen{$k};
}
$new->finish or die "$0: CDB_File finish failed: $!\n";
A cdb file can contain repeated keys. If the insert method is called more than once with the same key during the creation of a cdb
file, that key will be repeated.
Here's an example.
$cdb = new CDB_File ("$file.cdb", "$file.$$") or die ...;
$cdb->insert('cat', 'gato');
$cdb->insert('cat', 'chat');
$cdb->finish;
Normally, any attempt to access a key retrieves the first value stored under that key. This code snippet always prints gato.
$catref = tie %catalogue, 'CDB_File', "$file.cdb" or die ...;
print "$catalogue{cat}";
However, all the usual ways of iterating over a hash (keys, values, and each) do the Right Thing, even in the presence of repeated keys. This code snippet prints cat cat gato chat.
print join(' ', keys %catalogue, values %catalogue);
Internally, CDB_File stores extra information to keep track of where it is while iterating over a file. This extra information is not attached to the keys that keys returns, however: if you use them to retrieve values, each lookup retrieves the first value stored under that key.
This means that this code probably doesn't do what you want; it prints cat:gato cat:gato.
foreach $key (keys %catalogue) {
print "$key:$catalogue{$key} ";
}
The correct version uses each, and prints cat:gato cat:chat.
while (($key, $val) = each %catalogue) {
print "$key:$val ";
}
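Building on the each loop above, here is a minimal sketch that scans the whole database once and collects every value stored under each key into a hash of arrays. The file name, the sample records, and the %all variable are hypothetical; it assumes CDB_File is installed.

```perl
use strict;
use warnings;
use CDB_File;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/words.cdb";       # hypothetical database name

my $cdb = CDB_File->new($file, "$file.$$")
    or die "$0: new CDB_File failed: $!\n";
$cdb->insert('cat', 'gato');
$cdb->insert('dog', 'perro');
$cdb->insert('cat', 'chat');       # repeated key, not adjacent to the first
$cdb->finish or die "$0: CDB_File finish failed: $!\n";

tie my %catalogue, 'CDB_File', $file
    or die "$0: can't tie to $file: $!\n";

# One full pass with each; every (key, value) record is visited.
my %all;
while (my ($key, $val) = each %catalogue) {
    push @{ $all{$key} }, $val;
}
untie %catalogue;

print "$_: @{ $all{$_} }\n" for sort keys %all;
```

This costs a full scan, so it suits small databases or cases where every key is needed anyway.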
In general, there is no way to retrieve all the values associated with a key, other than to loop over the entire database (i.e. there is no equivalent to DB_File's get_dup method). However, the multi_get method retrieves the values associated with the first occurrence of a key, and all consecutive identical keys. It returns a reference to an array containing all the values. If you ensure that all occurrences of each key are adjacent in the database (perhaps by sorting them during database creation), then multi_get can be used to retrieve all the values associated with a key. This code prints gato chat.
print "@{$catref->multi_get('cat')}";
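The suggestion to sort during database creation might look like the following sketch. The input records arrive with the two cat entries separated; sorting by key before calling insert makes them adjacent, so multi_get can return both. The file name and the @pairs list are hypothetical; it assumes CDB_File is installed.

```perl
use strict;
use warnings;
use CDB_File;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sorted.cdb";      # hypothetical database name

# Input in arbitrary order; the two 'cat' records are not adjacent here.
my @pairs = (
    [ cat => 'gato'  ],
    [ dog => 'perro' ],
    [ cat => 'chat'  ],
);

my $cdb = CDB_File->new($file, "$file.$$")
    or die "$0: new CDB_File failed: $!\n";

# Sorting by key places all records for a key next to each other.
for my $pair (sort { $a->[0] cmp $b->[0] } @pairs) {
    $cdb->insert(@$pair);
}
$cdb->finish or die "$0: CDB_File finish failed: $!\n";

my $db = tie my %catalogue, 'CDB_File', $file
    or die "$0: can't tie to $file: $!\n";

my $cats = $db->multi_get('cat');  # reference to an array of values
print scalar(@$cats), " values for cat: @$cats\n";

# Drop the object reference before untying.
undef $db;
untie %catalogue;
```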
tie, new, and finish return false if the attempted operation failed; $! contains the reason for failure.
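As a small illustration of checking these return values, the sketch below ties to a file that deliberately does not exist and reports $! rather than dying. The path is hypothetical, and the exact error text depends on the platform; it assumes CDB_File is installed.

```perl
use strict;
use warnings;
use CDB_File;
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no-such-file.cdb";   # deliberately absent

my %h;
my $tied = tie %h, 'CDB_File', $missing;

if ($tied) {
    print "unexpectedly tied to $missing\n";
    undef $tied;
    untie %h;
} else {
    # $! holds the reason reported by the failed open.
    print "tie failed: $!\n";
}
```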
Do not use CDB_File to access something that isn't a cdb file.
The Perl interface to cdb imposes the restriction that data must fit into memory.
cdb(3).