Skip to content
This repository was archived by the owner on Jul 27, 2024. It is now read-only.

Conversation

nmdanny
Copy link
Collaborator

@nmdanny nmdanny commented Jun 28, 2021

Import feature

This feature allows a user to run a script within the "מידע אישי" site to scrape all his registered courses, and pass them
to the frontend. The script is executed by running a bookmark as explained on the import instructions page.

The script does not communicate with the back-end, rather, it communicates with the frontend page(locally, no network communication) by the use of cross-origin messaging API. This is why users are require to open HUJI's site via a pop-up.

This is good from several standpoints

  • Security-wise, the scrape script doesn't need authentication (which would've required embedding/passing auth credentials to the script, which is running in a potentially malicious origin)
  • UX: the results of the scrape are presented to the user in a table, allowing him to be aware of which courses cannot
    be added(missing from our database or summer/annual courses), and manually choose a semester for courses whose
    semester wasn't identified.
  • Separation of concerns/less code duplication: The front-end already has components for dealing with authentication
    and communication with the DB ($http and $auth), no need to embed them in the script as well

Other front-end changes

  • Use of a store object for holding courses, which should fix some difficulties regarding accessing the courses information from analytics view #96

  • The course store also saves to local storage whenever a change is made. In the future we should also save
    to DB

  • Some fixes and refactorings of ProgressBox (I'm guessing that feature is still WIP but it
    caused exceptions due to missing type, which would occur be if the student has no track assigned)

Future improvement ideas

  • Currently, the front-end makes a separate GET for each course(via the /course/?course_id=?? endpoint)
    for each scraped course to verify that said course exists in the DB(and to obtain its semester if it wasn't inferred
    from the personal information site)

    We should implement an endpoint that does all this at once (gets a list of course IDs and returns a list of courses),

  • Use proper, styled HTML elements instead of window.alert within the scrape script

    Done

  • Perhaps find a way to scrape the student's track? The grades page only seems to include his faculty, maybe this info
    appears somewhere else within "מידע אישי"

  • Implement Scrape course grade histogram from HUJI personal user information site #77, scrape grade statistics too

  • Paginate the parsed courses table which can get pretty long, and other UI improvements

  • Currently, when validating the existence of the parsed courses, I compare them to the Shnaton entries
    of 2021, because we didn't seem to parse other years yet. This can potentially cause mismatches(e.g, courses
    whose semester changed)

@nmdanny nmdanny requested review from rbenjos and noabarlia June 28, 2021 00:44
Copy link
Owner

@ScDor ScDor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work, please see the attached comments

@nmdanny nmdanny force-pushed the merge-scrape-user-courses branch 3 times, most recently from e9d66cf to 840e00a Compare July 4, 2021 00:34
nmdanny added 10 commits July 4, 2021 03:44
 Can now be injected in order to use within `setup()` functions
Though this feature is probably not complete, it currently errors if
the user's selection of courses doesn't include courses of a
certain type.

Furthermore, use of lifecycle hooks is not necessary, computed
properties are more appropriate here
Since the course DB only has courses from 2021 as of now,
and we don't have validation of takes anyway, there's no
harm in taking the semester detected by the scraper(using
the 'tkufa' in the statistics URL)
1. In case of a conflict between user choice and DB, respect user choice first

2. Annual/summer courses will be consideed as part of the first semester
Prefer semester in registration even if it clashes with
semester stored in DB
@nmdanny nmdanny force-pushed the merge-scrape-user-courses branch from 840e00a to e5a586d Compare July 4, 2021 01:10
nmdanny added 3 commits July 4, 2021 04:40
Course scrape script now utilizes modules,
and includes a vue UI
- bulma is now processed by bundler instead
  of being downloaded directly from CDN
- the bookmarklet entry point is now built separately
  from the site, as a library
@nmdanny nmdanny force-pushed the merge-scrape-user-courses branch from e5a586d to a4467bc Compare July 4, 2021 14:17
@ScDor
Copy link
Owner

ScDor commented Aug 26, 2021

Are there any changes required here due to the new data year mechanism, or can we merge?

@nmdanny nmdanny force-pushed the merge-scrape-user-courses branch from 44dc7b6 to 4977899 Compare August 26, 2021 13:57
@nmdanny
Copy link
Collaborator Author

nmdanny commented Aug 26, 2021

Are there any changes required here due to the new data year mechanism, or can we merge?

Made a tiny fix, should be good to go

@ScDor ScDor merged commit 9aded0b into master Aug 26, 2021
@ScDor ScDor deleted the merge-scrape-user-courses branch August 26, 2021 14:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

some difficulties regarding accessing the courses information from analytics view Parse student's past courses from HUJI website
2 participants