The LMS Toolkit provides utilities called "extractors" that fetch data from your Learning Management System (LMS).

These extractors may be used as part of a larger "pipeline" that moves data from your LMS to your ODS. For how-to information on that second step, see LMS Toolkit User Guide - Load Data into the ODS.

Step 1: Create an Extractor Configuration File

The first step in the pipeline is the extractor. Extractors connect to the API (or other applicable interface) of the LMS and extract the data.

To run an extractor, you need to give it the information it needs to connect to your LMS. The simplest way to do this is to put this data into a file named ".env" that sits in the same directory as the extractor. The LMS Toolkit gives you a sample file named ".env.example" to work with. You can find the sample files at these locations (use the path that corresponds to your LMS platform; [INSTALL DIRECTORY] is where the LMS Toolkit is installed):

    [INSTALL DIRECTORY]/LMS-Toolkit/src/canvas-extractor/.env.example

    [INSTALL DIRECTORY]/LMS-Toolkit/src/google-classroom-extractor/.env.example

    [INSTALL DIRECTORY]/LMS-Toolkit/src/schoology-extractor/.env.example

    Make a copy of this file and name it .env. Leave it in the same directory as the ".env.example" file.
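The copy step can also be scripted. The following is a minimal sketch, not part of the LMS Toolkit: `create_env_file` is a hypothetical helper name, and it assumes you pass it the extractor directory that holds ".env.example".

```python
import shutil
from pathlib import Path


def create_env_file(extractor_dir: Path) -> Path:
    """Copy .env.example to .env inside extractor_dir without overwriting."""
    example = extractor_dir / ".env.example"
    target = extractor_dir / ".env"
    if not example.is_file():
        raise FileNotFoundError(f"No .env.example found in {extractor_dir}")
    if target.exists():
        # Leave an existing .env alone so earlier edits are not lost.
        return target
    shutil.copyfile(example, target)
    return target
```

Refusing to overwrite an existing ".env" is a design choice of this sketch, so that re-running it does not discard settings you have already entered.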

    The file for Canvas LMS would look like this:

    .env file
    FEATURE=[activities, attendance, assignments, grades]

    Each .env file has 2 kinds of information in it:

    1. Configuration info that is SPECIFIC to the LMS you are trying to connect to.
    2. Configuration info that is common to ALL LMS Toolkit extractors.

    The sections below cover how to complete both of these sets of configuration options.

    Step 2: Add LMS-specific Configuration To Your Config File

    In the config file, you now need to fill out the elements specific to your LMS.

    Consult the section below that corresponds to your LMS.

      Canvas requires these four connection configuration parameters:

      Parameter name | What to put there
      Base URL | The base URL for the entire Canvas deployment, which might be something like "myschooldistrict.edu/canvas/2021".
      CANVAS_ACCESS_TOKEN | The Canvas API access token that you retrieved from the Canvas administration panel.
      Start Date and End Date | The dates between which you want to pull assignment, activity, and other data. We recommend that by default these dates span a semester or similar school calendar timespan.

      Google Classroom requires this one connection configuration parameter:

      Parameter name | What to put there
      Classroom account | The email address of the Google Classroom admin account.

      Schoology requires these two connection configuration parameters:

      Parameter name | What to put there
      Client key | The Schoology API key that you retrieved from the Schoology administration panel.
      Client secret | The Schoology API secret value that you retrieved from the Schoology administration panel.

      Now you should have a .env config file that is partially filled out.
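Before moving on, you may want to check that no required setting was left blank. The sketch below is an illustration only: `parse_env` and `missing_settings` are hypothetical helpers, not Toolkit functions, and the parsing is a simplified reading of KEY=VALUE lines rather than the extractor's own loader. Supply the list of required names from your own ".env.example" file.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional double quotes.
        settings[key.strip()] = value.strip().strip('"')
    return settings


def missing_settings(settings: dict, required: list) -> list:
    """Return the required keys that are absent or have an empty value."""
    return [key for key in required if not settings.get(key)]
```

For example, `missing_settings(parse_env(env_text), ["CANVAS_ACCESS_TOKEN"])` returns an empty list when the token line has been filled in, where `env_text` is the content of your ".env" file.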

      Step 3: Test the Connection to the LMS

      At this point, you should have a .env file that has the information it needs to connect to the LMS.

      Next, remove all the lines from the .env file that you did not edit above (the lines that are not LMS-specific). After doing that, the file should contain only the connection settings for your LMS.



        Next, open a command line tool (Windows command prompt, PowerShell, etc.) and navigate to the root directory for the extractor. Install the tool's dependencies before you execute the tool for the first time.

          cd [INSTALL DIRECTORY]/LMS-Toolkit/src/canvas-extractor/
          # One-time execution:
          poetry install

          cd [INSTALL DIRECTORY]/LMS-Toolkit/src/google-classroom-extractor/
          # One-time execution:
          poetry install

          cd [INSTALL DIRECTORY]/LMS-Toolkit/src/schoology-extractor/ 
          # One-time execution:
          poetry install

          Finally, run the command that corresponds to your extractor:

            poetry run python edfi_canvas_extractor

            poetry run python edfi_google_classroom_extractor

            poetry run python edfi_schoology_extractor

            You should see data flowing and lines that look like the following appear.

            2021-09-03 17:45:41,185 - INFO - edfi_canvas_extractor.extract_facade - Starting Ed-Fi LMS Canvas Extractor
            2021-09-03 17:45:41,213 - INFO - edfi_canvas_extractor.extract_facade - Extracting Courses from Canvas API
            2021-09-03 17:45:41,213 - INFO - - Pulling course data
            2021-09-03 17:45:43,925 - INFO - edfi_canvas_extractor.extract_facade - Extracting Sections from Canvas API
            2021-09-03 17:45:43,925 - INFO - edfi_canvas_extractor.api.sections - Pulling section data
            2021-09-03 17:46:08,866 - INFO - edfi_canvas_extractor.extract_facade - Writing LMS UDM Sections to CSV file
            2021-09-03 17:46:08,870 - INFO - edfi_lms_extractor_lib.csv_generation.write - Generated file => c:\source\ed-fi\lms-data\canvas\sections\2021-09-03-17-46-08.csv

            Congratulations: you are on your way to using LMS data!

            It may take a bit of time to complete this run, as you are now extracting basic data from your LMS on users and sections. If your school district is large, that might be tens of thousands of records, and will take some time to complete.

            In the next steps, we expand the config file to tell the extractor what LMS data we need, and decide if we need to do other optional configuration - keep reading to learn more.
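If you would rather start the extractor from a script than type the commands by hand, the run step in this section can be sketched as follows. The helper names (`extractor_dir`, `run_command`, `run_extractor`) are hypothetical and not part of the LMS Toolkit. The sketch assumes that poetry is on your PATH and that "poetry install" has already been run in the extractor directory.

```python
import subprocess
from pathlib import Path

# Package names taken from the run commands shown in this guide.
EXTRACTOR_PACKAGES = {
    "canvas": "edfi_canvas_extractor",
    "google-classroom": "edfi_google_classroom_extractor",
    "schoology": "edfi_schoology_extractor",
}


def extractor_dir(install_dir: Path, lms: str) -> Path:
    """Build the extractor's root directory under the install directory."""
    return install_dir / "LMS-Toolkit" / "src" / f"{lms}-extractor"


def run_command(lms: str) -> list:
    """Return the same run command you would type at the command line."""
    return ["poetry", "run", "python", EXTRACTOR_PACKAGES[lms]]


def run_extractor(install_dir: Path, lms: str) -> int:
    """Run the extractor from its own directory and return its exit code."""
    completed = subprocess.run(run_command(lms), cwd=extractor_dir(install_dir, lms))
    return completed.returncode
```

Running from the extractor's own directory matters because that is where the ".env" file lives.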

            If there are errors...

            Read the error messages and copy them down in case you need to share them with others. The most likely causes of errors are:

            • The configuration information was entered incorrectly. Double-check the info in the .env file to be sure it was copied correctly.
            • Some network configuration is blocking access from the machine you are on to the LMS. To resolve this, you will need to consult a network administrator at your organization.
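To help tell a network problem apart from a configuration problem, you can check whether your machine can open a connection to the LMS host at all. This is a rough sketch under stated assumptions: `can_reach` is a hypothetical helper, port 443 assumes the LMS is served over HTTPS, and a successful connection says nothing about whether your credentials are valid.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

If this returns False for your LMS host name, share that result with your network administrator along with the extractor's error messages.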

            Step 4: Add General Configuration to your Config File

            Now that the extractor is working, you need to tell it what data you need, and decide if you need to do other optional configuration.

            The most important configuration is to tell the extractor what data you need. Data is organized into domains or "features". To set this up, add a line to your .env file that looks like this:


            This line tells the extractor that you want all data related to assignments. 

            These are the features available. Please note that some are under development/experimental (see the "Experimental?" column) and that the Toolkit only extracts some of the LMS data in each area.

            Feature | Experimental? | Description
            assignments | No | Extracts data related to student assignments and student assignment submissions.
            activities | No | Extracts data related to student activities: logins/logouts (if available), discussion forum posts, etc.
            grades | Yes | If the LMS has an integrated gradebook, extracts grades.
            attendance | Yes | If the LMS has an integrated attendance system, extracts attendance data. Note: this is not a value calculated by the LMS Toolkit, and the Toolkit does not attempt to derive whether the student was in attendance. It is literally only what is in the LMS attendance system.

            To request multiple features, separate them with commas:

            FEATURE=[assignments, activities]
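A quick way to catch a misspelled feature name before a long run is to check the value yourself. The sketch below is only an illustration: `parse_features` is a hypothetical helper, and it does not claim to mirror how the extractor itself reads the FEATURE setting. The feature names come from the list above.

```python
# Feature names listed in this guide.
KNOWN_FEATURES = {"assignments", "activities", "grades", "attendance"}


def parse_features(value: str) -> list:
    """Split a FEATURE value such as "[assignments, activities]" into names."""
    inner = value.strip().strip("[]")
    names = [name.strip().lower() for name in inner.split(",") if name.strip()]
    unknown = [name for name in names if name not in KNOWN_FEATURES]
    if unknown:
        raise ValueError(f"Unknown feature(s): {', '.join(unknown)}")
    return names
```

Calling `parse_features("[assignments, activities]")` returns `["assignments", "activities"]`, while a value containing an unrecognized name raises a ValueError.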

            You may have questions about what exact data is included. That information is LMS-specific. If you need that info, you can consult: LMS Unifying Data Model#AdditionalMappingNotes

            At this point, we recommend that you add these two lines to your .env file. The first tells the extractor to gather assignments data.


            The second line tells the extractor to reduce the amount of text it prints, in effect saying: "just tell me when things go wrong."

            Your .env file should now contain the LMS-specific settings from Step 2 plus the two lines added above.




              Now run the extractor using the same command you used above. You should see data flowing again!

              It may take some time to complete this run, as you are now extracting data on ALL assignments and ALL  assignment submissions. If your school district is quite large, this could take a while. 

              Step 5: Verify Your Data

              The first step in verifying that the extractor got the data is to look for errors during the extractor run. Check whether any lines marked "WARNING" or "ERROR" appear.
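If you saved the extractor's output to a file, scanning it by hand can be tedious. The following sketch assumes the log lines follow the "timestamp - LEVEL - logger - message" layout shown in Step 3; `find_problem_lines` is a hypothetical helper, not a Toolkit function.

```python
# Levels at or above WARNING, from the Log Level option in this guide.
PROBLEM_LEVELS = {"WARNING", "ERROR", "CRITICAL"}


def find_problem_lines(log_text: str) -> list:
    """Return log lines whose level field is WARNING, ERROR, or CRITICAL."""
    problems = []
    for line in log_text.splitlines():
        # The level is the second " - "-separated field of each log line.
        parts = line.split(" - ", 2)
        if len(parts) >= 2 and parts[1].strip() in PROBLEM_LEVELS:
            problems.append(line)
    return problems
```

An empty result means no line was flagged; lines that do not match the assumed layout are ignored rather than reported.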

              A further inspection is to look in the filesystem for the files that the extractor created. By default, the data is extracted to a directory named "data" in the same directory where the extractor lives. If you open up that directory, you should see something that looks like this:

              The presence of a directory with a name denotes that the extractor is attempting to gather that class of data. For example, in the screenshot above you can see many "assignment=*" folders under a "section=*" folder: this is the extractor gathering data on individual assignments for a section.

              If extracting data from the LMS at a large school district, your filesystem should contain tens of thousands of such files.
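To get a quick sense of how much data was written, you can count the generated CSV files. This is a minimal sketch: `count_csv_files` is a hypothetical helper, and it assumes you point it at the output directory (by default the "data" directory next to the extractor).

```python
from collections import Counter
from pathlib import Path


def count_csv_files(data_dir: Path) -> Counter:
    """Count generated CSV files, grouped by top-level folder under data_dir."""
    counts = Counter()
    for csv_path in data_dir.rglob("*.csv"):
        relative = csv_path.relative_to(data_dir)
        # Files directly inside data_dir are grouped under ".".
        group = relative.parts[0] if len(relative.parts) > 1 else "."
        counts[group] += 1
    return counts
```

Only files ending in ".csv" are counted, so other files in the output directory do not affect the totals.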

              Step 6: Set any Additional Configuration Options (optional)

              Below are listed all the possible configuration options and which extractor each applies to. As you prepare the extractor for regular usage, you may choose to tune some of these options.

              When running with the command line tool, you can also provide the --help  option to get the full set of options for that extractor.

              Define which optional features are to be retrieved from the upstream system. Default: none. Available features:

              • Activities: encompassing section activities and system activities.
              • Attendance: attendance data. Only applies to Schoology.
              • Assignments: encompassing assignments and submissions.
              • Grades: section-level grades (assignment grades are included on the submissions resource). Experimental and only implemented for Canvas at this time.

              Note: Sections, Section Associations, and Users are always pulled from the Source System.

              Applies To | Argument | Required? | Purpose
               | Log Level | No | Valid options are: DEBUG, INFO (default), WARNING, ERROR, CRITICAL.
               | Output Directory | No | The output directory for the generated CSV files. Defaults to: ./data.
               | Sync database directory | No | Directory for storing a SQLite database that is used in support of synchronizing the data between successive executions of the tool. Defaults to: ./data.
              Google Classroom | Classroom account | Yes | The email address of the Google Classroom admin account.
              Google Classroom | Usage start date | No | Start date for usage data pull in YYYY-MM-DD format.
              Google Classroom | Usage end date | No | End date for usage data pull in YYYY-MM-DD format.
              Schoology | Client key | Yes | Schoology client key.
              Schoology | Client secret | Yes | Schoology client secret.
              Schoology | Page size | No | Page size for the paginated requests. Defaults to: 200. Max value: 200.
              Schoology | Input directory | No | Input directory for usage CSV files.
              Canvas | Base URL | Yes | The Canvas API base URL.
              Canvas | Access token | Yes | The Canvas API access token.
              Canvas | Start Date | Yes | Start date for the range of classes and events to include, in YYYY-MM-DD format.
              Canvas | End Date | Yes | End date for the range of classes and events to include, in YYYY-MM-DD format.
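Several of these options have constraints (the YYYY-MM-DD date format, the page size maximum of 200, the list of valid log levels) that are easy to check before a run. The sketch below is an illustration only: `parse_date` and `check_options` are hypothetical helpers, and the extractor performs its own validation independently of this code.

```python
import re
from datetime import date

LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
MAX_PAGE_SIZE = 200


def parse_date(value: str) -> date:
    """Parse a date that must be written exactly as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        raise ValueError(f"{value!r} is not in YYYY-MM-DD format")
    return date.fromisoformat(value)


def check_options(start: str, end: str, log_level: str = "INFO", page_size: int = 200) -> list:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    try:
        if parse_date(start) > parse_date(end):
            problems.append("start date is after end date")
    except ValueError as error:
        problems.append(str(error))
    if log_level not in LOG_LEVELS:
        problems.append(f"log level {log_level!r} is not one of {sorted(LOG_LEVELS)}")
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        problems.append(f"page size must be between 1 and {MAX_PAGE_SIZE}")
    return problems
```

Returning a list of problems instead of raising on the first one lets you fix every issue in a single pass over the .env file.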


              Are you a software developer familiar with Python?

              If you are, you can likely skim (or skip) much of this material and go directly to the README files in the GitHub repo!
