Here we'll focus on creating the corpus, and in the next section we'll tackle the semantic search application.

To keep things meta, we'll use the GitHub issues associated with a popular open source project: 🤗 Datasets! Let's take a look at how to get the data and explore the information contained in these issues.

You can find all the issues in 🤗 Datasets by navigating to the repository's Issues tab. As shown in the following screenshot, at the time of writing there were 331 open issues and 668 closed ones.

```python
per_page = 100  # Number of issues to return per page
num_pages = math.ceil(num_issues / per_page)
base_url = ""
for page in tqdm(range(num_pages)):
    # Query with state=all to get both open and closed issues
```

```python
'node_id': 'MDExOlB1bGxSZXF1ZXN0NzEwNzUyMjc0',
'body': '() dataset was recently updated after splits were added for the same. This PR contains new updated GooAQ with train/val/test splits and updated README as well.',
```

Whoa, that's a lot of information! We can see useful fields like title, body, and number that describe the issue, as well as information about the GitHub user who opened the issue.

Here we can see that each pull request is associated with various URLs, while ordinary issues have a None entry. We can use this distinction to create a new is_pull_request column that checks whether the pull_request field is None or not.

✏️ Try it out! Calculate the average time it takes to close issues in 🤗 Datasets. You may find the Dataset.filter() function useful for filtering out the pull requests and open issues, and you can use the Dataset.set_format() function to convert the dataset to a DataFrame so you can easily manipulate the created_at and closed_at timestamps. For bonus points, calculate the average time it takes to close pull requests.

Although we could proceed to further clean up the dataset by dropping or renaming some columns, it is generally good practice to keep the dataset as "raw" as possible at this stage so that it can be easily used in multiple applications.

Before we push our dataset to the Hugging Face Hub, let's deal with one thing that's missing from it: the comments associated with each issue and pull request.

## Augmenting the dataset

As shown in the following screenshot, the comments associated with an issue or pull request provide a rich source of information, especially if we're interested in building a search engine to answer user queries about the library. We'll add them next with - you guessed it - the GitHub REST API!
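The paginated download sketched in the code above can be made concrete as follows. This is only a minimal sketch: build_issue_urls is a hypothetical helper name, and the owner/repo values are illustrative. It relies on the standard GitHub REST API route GET /repos/{owner}/{repo}/issues, which accepts page, per_page, and state query parameters.

```python
import math


def build_issue_urls(owner, repo, num_issues, per_page=100):
    """Build one URL per page of issues, covering num_issues in total."""
    num_pages = math.ceil(num_issues / per_page)
    base_url = f"https://api.github.com/repos/{owner}/{repo}/issues"
    # GitHub pages are 1-indexed; state=all returns open and closed issues,
    # which on GitHub's issues endpoint also includes pull requests.
    return [
        f"{base_url}?page={page}&per_page={per_page}&state=all"
        for page in range(1, num_pages + 1)
    ]


urls = build_issue_urls("huggingface", "datasets", num_issues=999)
print(len(urls))  # 999 issues at 100 per page -> 10 pages
print(urls[0])
```

Each URL could then be fetched with an HTTP client such as requests, sleeping between batches to stay under the API's rate limit.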
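Pulling in the comments via the GitHub REST API could be sketched as follows. The helper names comments_url and extract_bodies are illustrative (not from the original text); the route GET /repos/{owner}/{repo}/issues/{number}/comments is the standard endpoint for listing an issue's comments, and it returns a JSON list of comment objects whose text lives in the body field.

```python
def comments_url(owner, repo, issue_number):
    """URL for the comments of a single issue or pull request."""
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/issues/{issue_number}/comments"
    )


def extract_bodies(comments_json):
    """Keep only the text of each comment object returned by the API."""
    return [comment["body"] for comment in comments_json]


# With the requests library installed, one issue's comments could be
# fetched like (issue_number=1 is a placeholder):
#   import requests
#   bodies = extract_bodies(
#       requests.get(comments_url("huggingface", "datasets", 1)).json()
#   )
sample = [{"body": "Thanks for the fix!"}, {"body": "LGTM"}]
print(extract_bodies(sample))  # ['Thanks for the fix!', 'LGTM']
```

Mapping such a function over every row would add a comments column to the dataset, which is exactly the augmentation step described above.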