-
Notifications
You must be signed in to change notification settings - Fork 42
budget_format_info
The file names will have the following form. For the final data type in the sandboxes.
These are the files to use when working on new features. Any other formats found may be wrong!
CODE For OKC FY<year_of_budget> Final.csv
CODE For OKC FY<year_of_budget> Final.json
c4okc_<year_of_budget>.csv
c4okc_<year_of_budget>.json
The data have a natural hierarchy as shown in the tree map according to the following fields listed in order of precedence.
- Agency Name
- LOB name
- Program Name
- AccountDescription
- Program Name
While the general outline to parse the CSV data used in the open budget is given, it is not necessary to write your own parsers in order to get a json representation. These files are parsed when received, into the following form:
[{"agency": string
,"fund" : string
,"lob" : string
,"program" : string
,"key" : string
,"value" : <number-as-string>} ... ]
In this case, <number-as-string>
implies a valid number when parsed by javascript using JSON.parse.
Here is a bit of json as a sample...
- Notice the
key
field is theAccountDescription
.
[
{
"agency": "Mayor and Council",
"fund": "GENERAL FUND ",
"lob": "Mayors Office",
"program": "MAYOR",
"key": "BUDGET-VACANCY DISCOUNT ",
"value": "-3598"
},
{
"agency": "Mayor and Council",
"fund": "GENERAL FUND ",
"lob": "Mayors Office",
"program": "MAYOR",
"key": "SALARIES AND WAGES ",
"value": "170007"
},
{
"agency": "Mayor and Council",
"fund": "GENERAL FUND ",
"lob": "Mayors Office",
"program": "MAYOR",
"key": "PART TIME & TEMPORARY WAGES ",
"value": "14000"
},
{
"agency": "Mayor and Council",
"fund": "GENERAL FUND ",
"lob": "Mayors Office",
"program": "MAYOR",
"key": "HEALTH AND WELFARE INSURANCE ",
"value": "25052"
}];
The CSV rules are based loosely on rfc4180
.
However an important distinction is the removal of the requirement for CRLF
and instead just requiring
the LF
which has become more common.
Below are the exceprts that are especially important if you are wanting to implement your own parser
- Note the change from CRLF to LF
- Pay attention to the escaping rules for commas and quotes.
1. Introduction
The comma separated values format (CSV) has been used for exchanging
and converting data between various spreadsheet programs for quite
some time. Surprisingly, while this format is very common, it has
never been formally documented. Additionally, while the IANA MIME
registration tree includes a registration for
"text/tab-separated-values" type, no MIME types have ever been
registered with IANA for CSV. At the same time, various programs and
operating systems have begun to use different MIME types for this
format. This RFC documents the format of comma separated values
(CSV) files and formally registers the "text/csv" MIME type for CSV
in accordance with RFC 2048 [1].
2. Definition of the CSV Format
While there are various specifications and implementations for the
CSV format (for ex. [4], [5], [6] and [7]), there is no formal
specification in existence, which allows for a wide variety of
interpretations of CSV files. This section documents the format that
seems to be followed by most implementations:
1. Each record is located on a separate line, delimited by a new line
(LF). For example:
aaa,bbb,ccc LF
zzz,yyy,xxx LF
2. The last record in the file may or may not have an ending line
break. For example:
aaa,bbb,ccc LF
zzz,yyy,xxx
3. There maybe an optional header line appearing as the first line
of the file with the same format as normal record lines. This
header will contain names corresponding to the fields in the file
and should contain the same number of fields as the records in
the rest of the file (the presence or absence of the header line
should be indicated via the optional "header" parameter of this
MIME type). For example:
field_name,field_name,field_name LF
aaa,bbb,ccc LF
zzz,yyy,xxx LF
4. Within the header and each record, there may be one or more
fields, separated by commas. Each line should contain the same
number of fields throughout the file. Spaces are considered part
of a field and should not be ignored. The last field in the
record must not be followed by a comma. For example:
aaa,bbb,ccc
5. Each field may or may not be enclosed in double quotes (however
some programs, such as Microsoft Excel, do not use double quotes
at all). If fields are not enclosed with double quotes, then
double quotes may not appear inside the fields. For example:
"aaa","bbb","ccc" LF
zzz,yyy,xxx
6. Fields containing line breaks (LF), double quotes, and commas
should be enclosed in double-quotes. For example:
"aaa","b LF
bb","ccc" LF
zzz,yyy,xxx
7. If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
The ABNF grammar [2] appears as follows:
file = [header LF] record *(LF record) [LF]
header = name *(COMMA name)
record = field *(COMMA field)
name = field
field = (escaped / non-escaped)
escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
non-escaped = *TEXTDATA
Informally, a little bit of sample csv looks like this:
FundCode ,FundDescription ,OperatingUnit ,OperatingUnitDescription ,Agency ,Agency Name ,ProgramID ,ProgramName ,LineOfBusinessID ,LOBName ,Account ,Character Class , Class Description ,Account Description ,Operating or Non ,FY2016 Budget
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,010 ,Mayor and Council ,0100101 ,MAYOR ,01001 ,Mayors Office ,51000004 ,51 , Personnel Services ,BUDGET-VACANCY DISCOUNT ,#NAME? ,-3598
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,010 ,Mayor and Council ,0100101 ,MAYOR ,01001 ,Mayors Office ,51010001 ,51 , Personnel Services ,SALARIES AND WAGES ,#NAME? ,170007
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,010 ,Mayor and Council ,0100101 ,MAYOR ,01001 ,Mayors Office ,51010004 ,51 , Personnel Services ,PART TIME & TEMPORARY WAGES ,#NAME? ,14000
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,010 ,Mayor and Council ,0100101 ,MAYOR ,01001 ,Mayors Office ,51070001 ,51 , Personnel Services ,HEALTH AND WELFARE INSURANCE ,#NAME? ,25052
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,040 ,City Manager ,0400101 ,CITY MANAGER'S OFFICE ,04001 ,City Managers Office ,51000004 ,51 , Personnel Services , BUDGET-VACANCY DISCOUNT ,#NAME? ,-538
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,040 ,City Manager ,0400101 ,CITY MANAGER'S OFFICE ,04001 ,City Managers Office ,51010001 ,51 , Personnel Services , SALARIES AND WAGES ,#NAME? ,25440
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,040 ,City Manager ,0400101 ,CITY MANAGER'S OFFICE ,04001 ,City Managers Office ,51070001 ,51 , Personnel Services , HEALTH AND WELFARE INSURANCE ,#NAME? ,5011
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,040 ,City Manager ,0400101 ,CITY MANAGER'S OFFICE ,04001 ,City Managers Office ,51070004 ,51 , Personnel Services , RETIREMENT PENSION CONTRIB ,#NAME? ,1477
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,040 ,City Manager ,0400101 ,CITY MANAGER'S OFFICE ,04001 ,City Managers Office ,51510004 ,51 , Personnel Services , EMPLOYEE PARKING ,#NAME? ,404
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,040 ,City Manager ,0400101 ,CITY MANAGER'S OFFICE ,04001 ,City Managers Office ,52280004 ,52 , Professional Services , CHARGEBACK-IT ,#NAME? ,144890
001 ,GENERAL FUND ,0001 ,GENERAL OPERATIONS-UASN ,040 ,City Manager ,0400101 ,CITY MANAGER'S OFFICE ,04001 ,City Managers Office ,52280007 ,52 , Professional Services , CHARGEBACK-RISK MANAGEMENT ,#NAME? ,18905