COP 3503 PROJECT 3 INPUT FILES

There are two kinds of files you will encounter for Project 3: Recipe Files and Inventory Files. I will modify the rules to make it easier to parse (get rid of some options), though you may parse the full version for extra credit. In what follows, I will give the restricted version so you will know what you must be able to handle. For the most part, this involves forcing the XML file to follow certain conventions.

All these files will have the .xml extension, and will start with an XML header that looks like this:

After these two tags, you will get to the meat of the matter.

In what follows, I will use caps for CFG variables, and put literals in quotes. For example, the CFG productions

    S -> "Hello " V ("!")*
    V -> "Kitty" | "World"

produce the sentences

    Hello Kitty
    Hello Kitty!
    Hello Kitty!!
    etc.

and

    Hello World
    Hello World!
    Hello World!!
    Hello World!!!
    etc.

I will use e for the empty string.

Here is the grammar with explanation.

    START -> COOKBOOK | INVENTORY
    INVENTORY -> "" (INGREDIENT_LIST | EQUIPMENT_LIST)+ ""

Recipe Files encode a cookbook, so COOKBOOK is the start symbol for them, while Inventory Files consist of equipment and/or ingredient lists, so INVENTORY will be the start symbol for them.

    EQUIPMENT_LIST -> "" EQUIPS ""
    EQUIPS -> EQUIPMENT | EQUIPMENT EQUIPS

An equipment list is one or more pieces of equipment.

    COOKBOOK -> "" (TITLE | e) S_OR_R_LIST ""

This means the cookbook may or may not have a title, followed by a list of sections and/or recipes, but it always starts and ends with its opening and closing tags.

    TITLE -> "" (R_CHAR)* ""
    R_CHAR -> ALPHA | DIGIT | WHITESPACE | '_' | '"' | '/' | PUNCT
    ALPHA -> alphabet characters
    DIGIT -> numeric characters
    WHITESPACE -> whitespace characters (space, tab, return)
    PUNCT -> ',' | '.' | ':' | '(' | ')' | '!' | '@' | '#' | '$' | '%' | '^' | '&'
           | '*' | '_' | '-' | '=' | '+' | '[' | ']' | '\' | '{' | '}' | '/' | '?'
           | '~'

Note that '<' and '>' are NOT included in PUNCT.

    S_OR_R_LIST -> (SECTION | RECIPE) | (SECTION | RECIPE) S_OR_R_LIST

This means there must be at least one section or recipe, but there may be any number of them in any order.

    SECTION -> "" (TITLE | e) RECIPE_LIST ""

So a section optionally has a title, followed by a recipe list.

    RECIPE_LIST -> RECIPE | RECIPE RECIPE_LIST

Again, a recipe list must have at least one recipe, but maybe more.

    RECIPE -> "" TITLE INGREDIENT_LIST PREPARATION ""

So a recipe is bookended with its opening and closing tags, with mandatory title, ingredient list, and preparation.

    INGREDIENT_LIST -> "" INGREDIENTS ""
    INGREDIENTS -> INGREDIENT | INGREDIENT INGREDIENTS

Here, I require there to be at least one ingredient in the list (the DTD file does not). Again, the list is bookended with the proper tags.

    INGREDIENT -> "" FOODITEM MODS ""
    INGREDIENT -> "" QUANTITY FOODITEM MODS ""
    INGREDIENT -> "" QUANTITY UNITS FOODITEM MODS ""

There are three possibilities for an ingredient - it must have a food item naming the actual ingredient, it may have a quantity, and if it has a quantity, it may also have units. I have also restricted the order in which these must appear to be the "natural" order, and restricted the modifying text outside the tags to the end (so you can have 3 cups green beans, washed and trimmed, but not washed and trimmed green beans, 3 cups).

    MODS -> R_CHAR*

This just lets the mods be pretty much any text that doesn't look like a tag.

    FOODITEM -> "" RCHARS+ ""
    QUANTITY -> "" NUMBER ""
    UNITS -> "" RCHARS+ ""

We don't try to interpret what the food item or the units are here, but we do require that the quantity be a number.

    NUMBER -> INTEGER | FLOAT | FRACTION
    INTEGER -> DIGIT+
    FLOAT -> (INTEGER | e) "." (INTEGER)
    FRACTION -> (INTEGER | e) " " INTEGER "/" INTEGER

We require a number to have one or more digits, optionally with a decimal point and fractional part, or optionally with a fraction expressed as numerator/denominator. Handling fractions is OPTIONAL for the project.

    PREPARATION -> "" STEP_LIST ""
    STEP_LIST -> STEP | STEP STEP_LIST
    STEP -> "" STEPTEXT ""
    STEPTEXT -> RCHARS (EQUIPMENT (RCHARS | e))*
    EQUIPMENT -> "" RCHARS ""

Preparation now must be in the form of steps.
Each step must have some text and optionally has equipment and more text.

    START -> COOKBOOK | INVENTORY
    INVENTORY -> "" (INGREDIENT_LIST | EQUIPMENT_LIST)+ ""
    EQUIPMENT_LIST -> "" EQUIPS ""
    EQUIPS -> EQUIPMENT | EQUIPMENT EQUIPS
    EQUIPMENT -> "" RCHARS ""
    COOKBOOK -> "" (TITLE | e) S_OR_R_LIST ""
    TITLE -> "" (R_CHAR)* ""
    S_OR_R_LIST -> (SECTION | RECIPE) | (SECTION | RECIPE) S_OR_R_LIST
    SECTION -> "" (TITLE | e) RECIPE_LIST ""
    RECIPE_LIST -> RECIPE | RECIPE RECIPE_LIST
    RECIPE -> "" TITLE INGREDIENT_LIST PREPARATION ""
    INGREDIENT_LIST -> "" INGREDIENTS ""
    INGREDIENTS -> INGREDIENT | INGREDIENT INGREDIENTS
    INGREDIENT -> "" FOODITEM MODS ""
    INGREDIENT -> "" QUANTITY FOODITEM MODS ""
    INGREDIENT -> "" QUANTITY UNITS FOODITEM MODS ""
    MODS -> R_CHAR*
    FOODITEM -> "" RCHARS+ ""
    QUANTITY -> "" NUMBER ""
    UNITS -> "" RCHARS+ ""
    NUMBER -> INTEGER | FLOAT | FRACTION
    INTEGER -> DIGIT+
    FLOAT -> (INTEGER | e) "." (INTEGER)
    FRACTION -> (INTEGER | e) " " INTEGER "/" INTEGER
    PREPARATION -> "" STEP_LIST ""
    STEP_LIST -> STEP | STEP STEP_LIST
    STEP -> "" STEPTEXT ""
    STEPTEXT -> RCHARS (EQUIPMENT (RCHARS | e))*
    R_CHAR -> ALPHA | DIGIT | WHITESPACE | '_' | '"' | '/' | PUNCT
    ALPHA -> alphabet characters
    DIGIT -> '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
    WHITESPACE -> whitespace characters (space, tab, return)
    PUNCT -> ',' | '.' | ':' | '(' | ')' | '!' | '@' | '#' | '$' | '%' | '^' | '&'
           | '*' | '_' | '-' | '=' | '+' | '[' | ']' | '\' | '{' | '}' | '/' | '?'
           | '~'