Updating the moma database Basic steps: 1. Save the existing database under a new database name a. Create a new database called moma2018 b. Dump the existing moma database using mysqldump c. Load the dump of the old database into the new database 2. Build the new moma database a. Download the data from moma.com b. Build the produced table 1. Create a temporary table to hold the correspondences between artists and artworks. 2. Save the contents of the temporary table to a local file 3. Flatten the file using Moma.java (did.csv) 4. Create the produced table 5. Load the produced table from did.csv c. Create the artists and artworks tables d. Load data from Artists.csv and Artworks.csv into the artists and artworks tables Details: 1. Save the existing database under a new database name I was able to do this with the utility program mysqldump. I ran it on my mac. a. First, I created a new database called moma2018 using phpmyadmin b. To dump the current database into a local file, called momadump.sql: mysqldump -h sql.cs.oberlin.edu -u jd -p moma > momadump.sql c. To load the contents of momadump.sql into the new database: mysql -h sql.cs.oberlin.edu -u jd -p moma2018 < momadump.sql (momadump.sql can be deleted once everything is finished and tested.) 2. Build the new moma database a. Download the files Artists.csv and Artworks.csv from the MoMA website. b. Create and populate a produced table which is in effect a table of intersection records between artists and artworks. 1. Run maketemp.sql and loadtemp.sql to load the correspondences between artist and artwork ids from the artworks.csv file. temp has two columns: cids (each row has a comma-separated list of artist ids) and objectid. Initially I had some trouble with this, as the local_infile option was disabled on the server. I fixed it using phpmyadmin by going to "variables" and setting "local infile" to ON. If there is still trouble, it may be at the client. If that's the case, use the option --local-infile=1 on the command line when running loadtemp.sql 2. Run storedid.sql to save the contents of moma.temp. Use the command mysql -N -h sql.cs.oberlin.edu -u jd -p < storedid.sql > did.dat The output format is 345,456,567|234 where 345, 456, and 567 are artist cids, and 234 is the object id 3. Run the Moma.java program to flatten did.dat into did.csv 4. Run makeproduced.sql to create the produced table. 5. Run loadproduced.sql to load the produced table from did.csv. c. Recreate the artists and artworks tables mysql -h sql.cs.oberlin.edu -u jd -p < makeartists.sql mysql -h sql.cs.oberlin.edu -u jd -p < makeartworks.sql d. Load data from Artists.csv and Artworks.csv into the artists and artworks tables mysql -h sql.cs.oberlin.edu --local-infile=1 -u jd -p < loadarists.sql mysql -h sql.cs.oberlin.edu --local-infile=1 -u jd -p < loadartworks.sql All done!! TODO: fix some of the datatypes: some of the varchar datatypes are not long enough the numeric datatype for the various dimensions needs to have a number of decimal places > 0