COP 3503 PROJECT 3
INPUT FILES
You will encounter two kinds of files in Project 3: Recipe Files and
Inventory Files. I have restricted the rules (removing some options) to
make the files easier to parse, though you may parse the full version
for extra credit. What follows is the restricted version, so you will
know what you must be able to handle. For the most part, the
restriction forces the XML file to follow certain conventions.
All these files will have the .xml extension, and will start with an XML
header that looks like this:
After these two tags, you will get to the meat of the matter.
In what follows, I will use caps for CFG variables, and put literals
in quotes. For example, the CFG productions
S -> "Hello " V ("!")*
V -> "Kitty" | "World"
produce the sentences
Hello Kitty
Hello Kitty!
Hello Kitty!!
etc.
and
Hello World
Hello World!
Hello World!!
Hello World!!!
etc.
I will use e for the empty string.
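The Hello example can be checked mechanically. The following is only a
sketch in Python (the function names are made up for illustration): since
this small grammar has no recursion, a regular expression recognizes
exactly the sentences it produces.

```python
import re

# S -> "Hello " V ("!")*    and    V -> "Kitty" | "World"
HELLO = re.compile(r"Hello (Kitty|World)!*")


def is_sentence(text):
    """Return True if text can be derived from S."""
    return HELLO.fullmatch(text) is not None


def sentences(max_bangs):
    """Yield every sentence with at most max_bangs exclamation marks."""
    for v in ("Kitty", "World"):
        for n in range(max_bangs + 1):
            yield "Hello " + v + "!" * n
```

For instance, is_sentence("Hello Kitty!!") is true, while
is_sentence("Hello Kitty !") is false because the grammar allows no space
before the "!".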
Here is the grammar with explanation.
START -> COOKBOOK | INVENTORY
INVENTORY -> "" (INGREDIENT_LIST | EQUIPMENT_LIST)+ ""
Recipe Files encode a cookbook, so COOKBOOK is the start symbol for
them, while inventory files consist of equipment and/or ingredient
lists, so INVENTORY will be the start symbol for them.
EQUIPMENT_LIST -> "" EQUIPS ""
EQUIPS -> EQUIPMENT | EQUIPMENT EQUIPS
An equipment list is one or more pieces of equipment.
COOKBOOK -> "" (TITLE | e) S_OR_R_LIST ""
This means the cookbook may or may not have a title, followed by
a list of sections and/or recipes, but it always starts and ends
with its opening and closing tags.
TITLE -> "" (R_CHAR)* ""
R_CHAR -> ALPHA | DIGIT | WHITESPACE | '_' | '"' | '/' | PUNCT
ALPHA -> alphabet characters
DIGIT -> numeric characters
WHITESPACE -> whitespace characters (space, tab, return)
PUNCT -> ',' | '.' | ':' | '(' | ')' | '!' | '@' | '#' | '$' | '%' |
'^' | '&' | '*' | '_' | '-' | '=' | '+' | '[' | ']' | '\' |
'{' | '}' | '/' | '?' | '~'
Note that '<' and '>' are NOT included in PUNCT
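One possible way to test for R_CHAR is a set lookup. This is a sketch
only, and it makes two assumptions not stated in the grammar: "alphabet
characters" is taken to mean the ASCII letters, and "return" is taken to
cover both '\r' and '\n'. The names is_r_char and is_r_text are
illustrative.

```python
import string

# PUNCT exactly as listed in the grammar; '<' and '>' are deliberately absent.
PUNCT = set(",.:()!@#$%^&*_-=+[]\\{}/?~")

# Assumption: "return" means either '\r' or '\n'.
WHITESPACE = set(" \t\r\n")

# Assumption: ALPHA is the ASCII letters and DIGIT is '0' through '9'.
R_CHARS = (set(string.ascii_letters) | set(string.digits)
           | WHITESPACE | set('_"/') | PUNCT)


def is_r_char(ch):
    """Return True if ch is a single character derivable from R_CHAR."""
    return len(ch) == 1 and ch in R_CHARS


def is_r_text(text):
    """Return True if every character of text is an R_CHAR (empty text passes)."""
    return all(ch in R_CHARS for ch in text)
```

Under the grammar as written, characters such as ';' and the apostrophe
are also rejected, since they appear in neither R_CHAR nor PUNCT.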
S_OR_R_LIST -> (SECTION | RECIPE) | (SECTION | RECIPE) S_OR_R_LIST
This means there must be at least one section or recipe, but there
may be any number of them in any order.
SECTION -> "" (TITLE | e) RECIPE_LIST ""
So a section optionally has a title, followed by a recipe list.
RECIPE_LIST -> RECIPE | RECIPE RECIPE_LIST
Again, a recipe list must have at least one recipe, but maybe more.
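Right-recursive productions of the form LIST -> ITEM | ITEM LIST, such as
S_OR_R_LIST and RECIPE_LIST, all mean "one or more ITEMs", so a parser can
handle them with a loop instead of recursion. The sketch below assumes a
hypothetical token list and a hypothetical item-parser convention
(return a (value, new_pos) pair on success, None on failure); neither is
part of the project specification.

```python
def parse_one_or_more(tokens, pos, parse_item):
    """Parse LIST -> ITEM | ITEM LIST, starting at tokens[pos].

    parse_item(tokens, pos) must return (value, new_pos) on success and
    None on failure. Returns (items, new_pos).
    """
    items = []
    while True:
        result = parse_item(tokens, pos)
        if result is None:
            break
        value, pos = result
        items.append(value)
    if not items:
        # The grammar requires at least one item in every such list.
        raise ValueError(f"expected at least one item at position {pos}")
    return items, pos


def parse_recipe_token(tokens, pos):
    """Hypothetical item parser: accept the single token "RECIPE"."""
    if pos < len(tokens) and tokens[pos] == "RECIPE":
        return tokens[pos], pos + 1
    return None
```

For S_OR_R_LIST, the item parser would try a SECTION first and fall back
to a RECIPE; the loop itself stays the same.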
RECIPE -> "" TITLE INGREDIENT_LIST PREPARATION ""
So a recipe is bookended with its opening and closing tags, with a
mandatory title, ingredient list, and preparation.
INGREDIENT_LIST -> "" INGREDIENTS ""
INGREDIENTS -> INGREDIENT | INGREDIENT INGREDIENTS
Here, I require there to be at least one ingredient in the list (the
dtd file does not). Again, the list is bookended with the proper tags.
INGREDIENT -> "" FOODITEM MODS ""
INGREDIENT -> "" QUANTITY FOODITEM MODS ""
INGREDIENT -> "" QUANTITY UNITS FOODITEM MODS ""
There are three possibilities for an ingredient: it must have a food
item naming the actual ingredient, it may have a quantity, and if it
has a quantity, it may also have units. I have also restricted the
order in which these appear to the "natural" order, and restricted the
modifying text outside the tags to the end (so you can have 3 cups
greenbeans, washed and trimmed, but not washed and trimmed greenbeans,
3 cups).
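The three INGREDIENT productions amount to a small table of allowed
child orders. As a sketch, assume the parser has already reduced an
ingredient to a list of child kinds in document order (the kind names
and the function below are illustrative, not part of the specification):

```python
# The three legal shapes of an ingredient, before the optional trailing MODS.
ALLOWED_SHAPES = (
    ("FOODITEM",),
    ("QUANTITY", "FOODITEM"),
    ("QUANTITY", "UNITS", "FOODITEM"),
)


def is_valid_ingredient(kinds):
    """Check a sequence of child kinds against the INGREDIENT productions.

    "MODS" stands for free modifying text and may appear only at the end.
    """
    kinds = tuple(kinds)
    if kinds and kinds[-1] == "MODS":
        kinds = kinds[:-1]
    return kinds in ALLOWED_SHAPES
```

So ["QUANTITY", "UNITS", "FOODITEM", "MODS"] (3 cups greenbeans, washed
and trimmed) is accepted, while ["MODS", "FOODITEM", "QUANTITY", "UNITS"]
and ["UNITS", "FOODITEM"] are rejected.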
MODS -> (R_CHAR)*
This just lets the mods be pretty much any text that doesn't look like a tag.
FOODITEM -> "" RCHARS+ ""
QUANTITY -> "" NUMBER ""
UNITS -> "" RCHARS+ ""
We don't try to interpret what the food item or the units are here, but we
do require that the quantity be a number.
NUMBER -> INTEGER | FLOAT | FRACTION
INTEGER -> DIGIT+
FLOAT -> (INTEGER | e) "." (INTEGER)
FRACTION -> (INTEGER | e) " " INTEGER "/" INTEGER
We require a number to have one or more digits, optionally with decimal
point and fractional part, or optionally with fraction expressed as
numerator/denominator. Handling fractions is OPTIONAL for the project.
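The NUMBER productions translate almost directly into a regular
expression. The following sketch follows the grammar literally, so a
fraction needs the separating space even when the whole part is omitted
(" 3/4" matches, "3/4" does not), and "3." is not a FLOAT because digits
are required after the point. Returning an exact Fraction is a choice
made for this example, not something the project requires.

```python
import re
from fractions import Fraction

INTEGER = r"[0-9]+"

# FRACTION is tried first, then FLOAT, then INTEGER.
NUMBER = re.compile(
    rf"(?P<frac>(?P<whole>{INTEGER})? (?P<num>{INTEGER})/(?P<den>{INTEGER}))"
    rf"|(?P<float>(?:{INTEGER})?\.{INTEGER})"
    rf"|(?P<int>{INTEGER})"
)


def is_number(text):
    """Return True if text is derivable from NUMBER."""
    return NUMBER.fullmatch(text) is not None


def parse_number(text):
    """Convert a NUMBER to an exact Fraction; raise ValueError otherwise."""
    m = NUMBER.fullmatch(text)
    if m is None:
        raise ValueError(f"not a NUMBER: {text!r}")
    if m.group("frac"):
        whole = int(m.group("whole") or 0)
        # A zero denominator raises ZeroDivisionError here.
        return whole + Fraction(int(m.group("num")), int(m.group("den")))
    # INTEGER and FLOAT strings are accepted by Fraction directly.
    return Fraction(text)
```

With this sketch, parse_number("1 1/2") and parse_number("1.5") both
give Fraction(3, 2).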
PREPARATION -> "" STEP_LIST ""
STEP_LIST -> STEP | STEP STEP_LIST
STEP -> "" STEPTEXT ""
STEPTEXT -> RCHARS (EQUIPMENT (RCHARS | e))*
EQUIPMENT -> "" RCHARS ""
Preparation now must be in the form of steps. Each step must have some
text and optionally has equipment and more text.
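The STEPTEXT production fixes the order in which text and equipment may
alternate inside a step. As a sketch, assume a step has already been
split into a list of child kinds, "TEXT" for a run of RCHARS and
"EQUIPMENT" for an equipment element (these names are illustrative); the
shape can then be checked with a one-letter-per-child regular expression.

```python
import re

# STEPTEXT -> RCHARS (EQUIPMENT (RCHARS | e))*
# With T for a run of text and E for an equipment element: T (E T?)*
STEP_SHAPE = re.compile(r"T(ET?)*")


def is_valid_step(kinds):
    """Check a sequence of "TEXT"/"EQUIPMENT" kinds against STEPTEXT."""
    codes = {"TEXT": "T", "EQUIPMENT": "E"}
    # Unknown kinds map to "?", which can never match the pattern.
    shape = "".join(codes.get(kind, "?") for kind in kinds)
    return STEP_SHAPE.fullmatch(shape) is not None
```

One consequence of the production as written is that a step cannot begin
with equipment: ["EQUIPMENT", "TEXT"] is rejected, while
["TEXT", "EQUIPMENT", "TEXT"] and ["TEXT", "EQUIPMENT", "EQUIPMENT"] are
accepted.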
START -> COOKBOOK | INVENTORY
INVENTORY -> "" (INGREDIENT_LIST | EQUIPMENT_LIST)+ ""
EQUIPMENT_LIST -> "" EQUIPS ""
EQUIPS -> EQUIPMENT | EQUIPMENT EQUIPS
EQUIPMENT -> "" RCHARS ""
COOKBOOK -> "" (TITLE | e) S_OR_R_LIST ""
TITLE -> "" (R_CHAR)* ""
S_OR_R_LIST -> (SECTION | RECIPE) | (SECTION | RECIPE) S_OR_R_LIST
SECTION -> "" (TITLE | e) RECIPE_LIST ""
RECIPE_LIST -> RECIPE | RECIPE RECIPE_LIST
RECIPE -> "" TITLE INGREDIENT_LIST PREPARATION ""
INGREDIENT_LIST -> "" INGREDIENTS ""
INGREDIENTS -> INGREDIENT | INGREDIENT INGREDIENTS
INGREDIENT -> "" FOODITEM MODS ""
INGREDIENT -> "" QUANTITY FOODITEM MODS ""
INGREDIENT -> "" QUANTITY UNITS FOODITEM MODS ""
MODS -> (R_CHAR)*
FOODITEM -> "" RCHARS+ ""
QUANTITY -> "" NUMBER ""
UNITS -> "" RCHARS+ ""
NUMBER -> INTEGER | FLOAT | FRACTION
INTEGER -> DIGIT+
FLOAT -> (INTEGER | e) "." (INTEGER)
FRACTION -> (INTEGER | e) " " INTEGER "/" INTEGER
PREPARATION -> "" STEP_LIST ""
STEP_LIST -> STEP | STEP STEP_LIST
STEP -> "" STEPTEXT ""
STEPTEXT -> RCHARS (EQUIPMENT (RCHARS | e))*
R_CHAR -> ALPHA | DIGIT | WHITESPACE | '_' | '"' | '/' | PUNCT
ALPHA -> alphabet characters
DIGIT -> '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
WHITESPACE -> whitespace characters (space, tab, return)
PUNCT -> ',' | '.' | ':' | '(' | ')' | '!' | '@' | '#' | '$' | '%' |
'^' | '&' | '*' | '_' | '-' | '=' | '+' | '[' | ']' | '\' |
'{' | '}' | '/' | '?' | '~'