Filter email list to not show unsubscribe emails. #1513

Open
gs0510 opened this issue Apr 13, 2021 · 16 comments · May be fixed by #1558
Labels
medium More Complex Issues for Outreachy

Comments

@gs0510

gs0510 commented Apr 13, 2021

On the https://ocaml.org/community/ page, the recent email threads show all emails sent to the list. Filter the list so that unsubscribe emails are not displayed.

Screenshot from 2021-04-13 13-37-16

@gs0510
Author

gs0510 commented Apr 13, 2021

@Ndipbanyan since you were looking for a medium issue, you can go ahead and work on this one.

@Ndipbanyan
Contributor

@gs0510 Alright. Thank you. I will begin working on it and reach out for any help or clarifications that I might need.

@patricoferris added the medium More Complex Issues for Outreachy label Apr 13, 2021
@gs0510
Author

gs0510 commented Apr 15, 2021

@Ndipbanyan Have you been able to make any progress? Do you have any questions? Thanks!

@Ndipbanyan
Contributor

@gs0510 I have found the code that generates this list, in rss2html.ml in the script directory, and I am trying to understand the function involved to see whether I can modify it to filter the list. The main obstacle at the moment is my limited understanding of the OCaml language, but I am still going through tutorials to catch up.

@gs0510
Author

gs0510 commented Apr 15, 2021

okay, let me know if you run into any problems! Thanks!

@Ndipbanyan
Contributor

@gs0510 I came up with a solution and want to be clear about it before creating a PR. Let me try to explain. The API that is consumed to display the recent email threads returns a list of items, and each item has a title tag containing the subject of the email and the sender's address. Below is what I am referring to:
Screenshot 2021-04-18 at 17 55 55

generated from https://sympa.inria.fr/sympa/rss/latest_arc/caml-list?count=40

Looking at the above, you will notice that the item with the title <title>[Caml-list] - [email protected]</title> has "[Caml-list]" as its email subject, the item with the title <title>[Caml-list] [CFP] Logical Frameworks and Meta-Languages: Theory and Practice - [email protected]</title> has "[Caml-list] [CFP] Logical Frameworks and Meta-Languages: Theory and Practice" as its subject, and the item with the title <title>[Caml-list] unsubscribe - [email protected]</title> has "[Caml-list] unsubscribe" as its subject.
Now, in /script/rss2html.ml, line 595 contains a regular expression that strips "Re:" and anything between [ ] from the subject (the text between the <title> </title> tags). As a result, [Caml-list] and [CFP] are removed from the titles above and only the remainder is displayed. In the case of <title>[Caml-list] - [email protected]</title>, nothing is left of the subject once [Caml-list] has been removed, so only "- [email protected]" is displayed. Given all this, my implementation adds unsubscribe to the regex, so that <title>[Caml-list] unsubscribe - [email protected]</title> ends up displayed as "- [email protected]" in the recent email threads.

This has become rather long :). The point of all this explanation is to check whether my implementation is what you had in mind or whether you meant something entirely different. Thank you for taking the time to help me with this.
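
To illustrate the behaviour described in this comment, here is a minimal sketch. It is not the code from /script/rss2html.ml: the two patterns and the function name normalize are assumptions made for illustration, the address is a placeholder, and the sketch needs the str library to be linked.

```ocaml
(* Hypothetical stand-in for the title normalization discussed above;
   the real pattern in script/rss2html.ml may differ.
   Requires the [str] library. *)

(* Matches a bracketed tag such as "[Caml-list]" or a "Re:" marker,
   together with any spaces that follow it. *)
let noise = Str.regexp "\\[[^]]*\\] *\\|Re: *"

(* The same pattern with the word "unsubscribe" added, as first proposed. *)
let noise_with_unsubscribe =
  Str.regexp "\\[[^]]*\\] *\\|Re: *\\|unsubscribe *"

(* Remove every match of [re] from [title] and trim the result. *)
let normalize re title = String.trim (Str.global_replace re "" title)

let () =
  (* Placeholder address, not taken from the feed. *)
  let title = "[Caml-list] unsubscribe - user@example.org" in
  print_endline (normalize noise title);
  print_endline (normalize noise_with_unsubscribe title)
```

The first call prints "unsubscribe - user@example.org" and the second prints "- user@example.org". In both cases the entry is still in the list; only its displayed title changes.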

@gs0510
Author

gs0510 commented Apr 19, 2021

Hi @Ndipbanyan! You are almost right :) We want to stop displaying the unsubscribe threads in the email feed altogether, not remove the word unsubscribe from their titles. The function normalize_title only normalizes titles (removing [CFP] and so on). What we want is to remove the unsubscribe posts from the list of posts, so you can go through the list, check whether a post has unsubscribe in its title, and remove it from the list if so. Hope this helps!

Let me know if anything is unclear, or if there's anything OCaml related that you don't understand :)
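
The suggestion above could be sketched roughly as follows, using only the standard library. The post record, its fields, and the names contains, is_unsubscribe and drop_unsubscribe are hypothetical and chosen for illustration; the real types in rss2html.ml will differ.

```ocaml
(* Hypothetical post type; the real record in script/rss2html.ml
   has more fields and possibly different names. *)
type post = { title : string; link : string }

(* [contains ~sub s] is true when [sub] occurs somewhere in [s]. *)
let contains ~sub s =
  let n = String.length sub and m = String.length s in
  let rec at i = i + n <= m && (String.sub s i n = sub || at (i + 1)) in
  at 0

(* A post counts as an unsubscribe request when its title mentions
   the word, in any letter case. *)
let is_unsubscribe p =
  contains ~sub:"unsubscribe" (String.lowercase_ascii p.title)

(* Drop whole posts from the list instead of editing their titles. *)
let drop_unsubscribe posts =
  List.filter (fun p -> not (is_unsubscribe p)) posts

let () =
  let posts =
    [ { title = "[Caml-list] [CFP] Logical Frameworks and Meta-Languages";
        link = "https://example.org/1" };
      { title = "[Caml-list] Unsubscribe"; link = "https://example.org/2" } ]
  in
  List.iter
    (fun p -> Printf.printf "%s <%s>\n" p.title p.link)
    (drop_unsubscribe posts)
```

Only the first post is printed. Filtering before any title normalization keeps the two concerns separate: normalize_title keeps cleaning titles, while the filter decides which posts appear at all.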

@Ndipbanyan
Contributor

Thank you @gs0510 for the clarification. I will look into implementing this and let you know if I run into any issues. Thanks

@Ndipbanyan
Contributor

Ndipbanyan commented Apr 23, 2021

@gs0510 I have been having issues running make or make production since I installed the OCaml Platform extension for VS Code. Below is the error I was getting:
Screenshot 2021-04-23 at 07 50 09
I uninstalled the extension, but then cohttp-server-lwt ./ocaml.org wouldn't start anymore, and running make gives the error below:
Screenshot 2021-04-23 at 08 50 02

Please can you help me detect what the problem is?

@gs0510
Author

gs0510 commented Apr 23, 2021

@Ndipbanyan Both errors are related to omd. Can you run opam show omd to see what version of omd you have?

cohttp-server-lwt ./ocaml.org will work only if your make command is successful.

@Ndipbanyan
Contributor

After running opam show omd I got this:
Screenshot 2021-04-23 at 12 12 03

@gs0510
Author

gs0510 commented Apr 23, 2021

The website doesn't work with the latest version of omd (see issue #1321). You need to downgrade omd to 1.3.1, and it should be okay after that :)

@Ndipbanyan
Contributor

Yes! It works now. Thanks. Got me stuck there for a while.

@Ndipbanyan
Contributor

Ndipbanyan commented Apr 23, 2021

Also, I think I have been able to filter the emails now. My implementation is as follows:
I wrote a regex (for the word unsubscribe) and added an else if branch to the must_keep function to exclude any post whose title matches the regex. Is this implementation okay?

Before:

Screenshot 2021-04-23 at 15 01 24

After:

Screenshot 2021-04-23 at 15 02 39

Code snippet (lines 592 and 614)

Screenshot 2021-04-23 at 15 05 46

@gs0510
Author

gs0510 commented Apr 24, 2021

This looks good, @Ndipbanyan. You can make the regex case-insensitive so that every capitalization of unsubscribe is filtered out. You should also open a PR. :)

@Ndipbanyan linked a pull request Apr 24, 2021 that will close this issue
4 tasks
@Ndipbanyan
Contributor

Great! I've opened a PR. I used Str.regexp_case_fold as opposed to just Str.regexp so I believe that makes it case agnostic.
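
For readers comparing the two functions, here is a small sketch of the difference between Str.regexp and Str.regexp_case_fold. The pattern, the helper matches, and the simplified shape of must_keep are assumptions for illustration only (the code in the PR was shared as a screenshot), and the str library must be linked.

```ocaml
(* Requires the [str] library. The pattern and the shape of [must_keep]
   are assumptions; the real function in script/rss2html.ml checks
   other conditions as well. *)

(* Case-sensitive and case-insensitive versions of the same pattern. *)
let unsubscribe_cs = Str.regexp "unsubscribe"
let unsubscribe_ci = Str.regexp_case_fold "unsubscribe"

(* [matches re s] is true when [re] matches anywhere in [s]. *)
let matches re s =
  match Str.search_forward re s 0 with
  | _ -> true
  | exception Not_found -> false

(* Hypothetical simplification: keep a post unless its title matches. *)
let must_keep title = not (matches unsubscribe_ci title)

let () =
  let title = "[Caml-list] Unsubscribe" in
  Printf.printf "case-sensitive match: %b\n" (matches unsubscribe_cs title);
  Printf.printf "case-insensitive match: %b\n" (matches unsubscribe_ci title);
  Printf.printf "kept: %b\n" (must_keep title)
```

With the capitalized title, the case-sensitive pattern does not match while the case-folded one does, so the post is dropped only in the second case.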
