Use case #4

Open
gedw99 opened this issue Aug 14, 2023 · 1 comment
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers

Comments


gedw99 commented Aug 14, 2023

I am struggling to understand the description at https://github.com/koddr/json2csv#-solving-case

Could you add a real-world use case with an example in a folder?

I am intrigued because I do lots of work with CSV and JSON.

I don’t get the reasons for the intent or word filters, or why you add json I to the first column of the CSV, etc.

So it definitely needs a real use case example!

Hope that’s OK.

gedw99 added the "enhancement" label Aug 14, 2023
koddr added the "good first issue" label Aug 15, 2023
Owner

koddr commented Aug 15, 2023

Hi,

Thanks for your interest! OK, let me try to describe a more real-world use case of json2csv. I will describe exactly the case that led me to write this parser 😊

I had about 800k JSON files (~2.2 GB as a zip archive) as input, each containing structured content in this format:

[
   {
      "chat_uuid": "***",
      "message_uuid": "***",
      "assigned_team": null,
      "who": "user",
      "created_at": "2022-09-08-08T08:30:46.109596+00:00",
      "type": "botrequest",
      "content": "/start"
   },
   {
      "chat_uuid": "***",
      "message_uuid": "***",
      "assigned_team": null,
      "who": "user",
      "created_at": "2022-09-08-08T08:50:10.110780+00:00",
      "type": "botrequest",
      "content": "Hello! I need course."
   },
   {
      "chat_uuid": "***",
      "message_uuid": "***",
      "assigned_team": "Forced Sessions",
      "who": "operator",
      "created_at": "2022-09-08-08T11:04:12.682817+00:00",
      "type": "botstate",
      "content": "Good afternoon, my name is Daniel. You were interested in the courses of educational platform. We are ready to tell you more about the courses.\nWhat direction is most interesting?"
   },

   ...
]
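
To make the input shape concrete, here is a minimal Python sketch (not json2csv's actual code, which this thread does not show) of reading a folder of such files one at a time. The function name `iter_messages` and the assumption that the archive has already been unpacked into a folder of `*.json` files are illustrative only.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_messages(folder: Path) -> Iterator[dict]:
    """Yield every message object from every *.json file in `folder`.

    Hypothetical helper: assumes each file holds one JSON array of
    objects, as in the sample above.
    """
    for path in sorted(folder.glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        # Hand objects out one by one, so only a single file's content
        # is held in memory at any time.
        yield from data
```

With hundreds of thousands of files, a generator like this avoids loading the whole data set at once.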

And I needed to:

  1. Select from these files only those objects that have "who": "user";
  2. Throw out from this selection some objects, e.g., where "content": "/start";
  3. Perform a quick search for string occurrences by certain parameters;
  4. Save the resulting objects in CSV format;
  5. Break these CSVs into smaller parts, e.g., 1000 lines per file.
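
The five steps above could be sketched in Python roughly as follows. This is only an illustration of the idea, not how json2csv itself is implemented: the names `select_messages`, `write_chunks` and `FIELDS`, the choice of CSV columns, and the reading of step 3 as a case-insensitive keyword match are all assumptions.

```python
import csv
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator

# Hypothetical column choice; the thread does not say which fields were kept.
FIELDS = ["chat_uuid", "message_uuid", "created_at", "content"]


def select_messages(messages: Iterable[dict], stop_contents: set,
                    keywords: list) -> Iterator[dict]:
    """Steps 1-3: keep user messages, drop stop contents, match keywords."""
    for msg in messages:
        if msg.get("who") != "user":            # step 1: only "who": "user"
            continue
        content = msg.get("content") or ""
        if content in stop_contents:            # step 2: e.g. "/start"
            continue
        lowered = content.lower()
        # Step 3 (assumed meaning): keep a message if any keyword occurs in it.
        if keywords and not any(k.lower() in lowered for k in keywords):
            continue
        yield msg


def write_chunks(rows: Iterable[dict], out_dir: Path,
                 chunk_size: int = 1000) -> list:
    """Steps 4-5: write rows to CSV files of at most `chunk_size` data rows."""
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = iter(rows)
    paths = []
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            break
        path = out_dir / f"part_{len(paths) + 1:05d}.csv"
        with path.open("w", newline="", encoding="utf-8") as fh:
            # Keys not listed in FIELDS are ignored; each file gets a header row.
            writer = csv.DictWriter(fh, fieldnames=FIELDS, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(chunk)
        paths.append(path)
    return paths
```

Because both functions work on plain iterables, the filter can be fed lazily and its output passed straight to `write_chunks(select_messages(...), Path("out"))` without materializing the whole selection.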

The resulting files were then sent to ML specialists to customize their AI model.

In other words, I had a lot of raw data that I needed to prepare for future use in a specific format. This is exactly the task that json2csv solves.

I hope it is now more or less clear how to apply it.
