coursera-dl/coursera-dl

Showing 71 of 71 total issues

File api.py has 1220 lines of code (exceeds 250 allowed). Consider refactoring.
Open

# vim: set fileencoding=utf8 :
"""
This module contains implementations of different APIs that are used by the
downloader.
"""
Severity: Major
Found in coursera/api.py - About 3 days to fix

    Similar blocks of code found in 2 locations. Consider refactoring.
    Open

        def extract_links_from_reference(self, short_id):
            """
            Return a dictionary with supplement files (pdf, csv, zip, ipynb, html
            and so on) extracted from supplement page.
    
    
    Severity: Major
    Found in coursera/api.py and 1 other location - About 1 day to fix
    coursera/api.py on lines 1238..1282

    Duplicated Code

    Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

    Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

    When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

    Tuning

    This issue has a mass of 179.

    We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

    The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

    If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

    See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.
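
As a rough intuition for what "mass" measures, the sketch below counts the nodes in a parsed Python syntax tree using the standard `ast` module. It assumes that mass grows with the size of a block's syntax tree. The helper name `approximate_mass` is hypothetical, and the numbers it produces are not expected to match the values the engine reports.

```python
import ast


def approximate_mass(source: str) -> int:
    """Count every node in the parsed syntax tree of `source`.

    Hypothetical helper: an illustration of "bigger tree, bigger mass",
    not the duplication engine's actual computation.
    """
    tree = ast.parse(source)
    return sum(1 for _ in ast.walk(tree))


small = "x = 1\n"
large = "def f(a, b):\n    if a:\n        return a + b\n    return b\n"

# A longer block has more syntax-tree nodes, so it is more likely to clear
# a given threshold and be compared against other blocks.
print(approximate_mass(small), approximate_mass(large))
```

Under this reading, raising the threshold excludes small blocks such as `small` from comparison, while lowering it lets them be compared as well.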

    Similar blocks of code found in 2 locations. Consider refactoring.
    Open

        def extract_links_from_supplement(self, element_id):
            """
            Return a dictionary with supplement files (pdf, csv, zip, ipynb, html
            and so on) extracted from supplement page.
    
    
    Severity: Major
    Found in coursera/api.py and 1 other location - About 1 day to fix
    coursera/api.py on lines 1349..1392

    This issue has a mass of 179.

    Function _parse_on_demand_syllabus has a Cognitive Complexity of 69 (exceeds 5 allowed). Consider refactoring.
    Open

        def _parse_on_demand_syllabus(self, course_name, page, reverse=False,
                                      unrestricted_filenames=False,
                                      subtitle_language='en',
                                      video_resolution=None,
                                      download_quizzes=False,
    Severity: Minor
    Found in coursera/extractors.py - About 1 day to fix

    Cognitive Complexity

    Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

    A method's cognitive complexity is based on a few simple rules:

    • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
    • Code is considered more complex for each "break in the linear flow of the code"
    • Code is considered more complex when "flow breaking structures are nested"
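
To make the nesting rule concrete, here is a small sketch with two hypothetical functions that return the same result for the inputs shown. The first stacks conditionals inside each other, so every extra level adds to the count; the second uses early exits to reduce the nesting depth. The function names and the shape of the `video` dictionary are assumptions made for illustration and are not taken from coursera-dl.

```python
def first_english_subtitle_nested(video):
    """Nested version: each extra level of nesting adds to the score."""
    if video is not None:
        if "subtitles" in video:
            for track in video["subtitles"]:
                if track.get("lang") == "en":
                    return track.get("url")
    return None


def first_english_subtitle_flat(video):
    """Flatter version: a guard clause exits early and removes two levels."""
    if video is None:
        return None
    for track in video.get("subtitles", []):
        if track.get("lang") == "en":
            return track.get("url")
    return None


sample = {"subtitles": [{"lang": "de", "url": "de.srt"},
                        {"lang": "en", "url": "en.srt"}]}
print(first_english_subtitle_nested(sample), first_english_subtitle_flat(sample))
```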

    Similar blocks of code found in 3 locations. Consider refactoring.
    Open

        def extract_links_from_programming(self, element_id):
            """
            Return a dictionary with links to supplement files (pdf, csv, zip,
            ipynb, html and so on) extracted from graded programming assignment.
    
    
    Severity: Major
    Found in coursera/api.py and 2 other locations - About 1 day to fix
    coursera/api.py on lines 1135..1168
    coursera/api.py on lines 1204..1236

    This issue has a mass of 140.

    Similar blocks of code found in 3 locations. Consider refactoring.
    Open

        def extract_links_from_programming_immediate_instructions(self, element_id):
            """
            Return a dictionary with links to supplement files (pdf, csv, zip,
            ipynb, html and so on) extracted from graded programming assignment.
    
    
    Severity: Major
    Found in coursera/api.py and 2 other locations - About 1 day to fix
    coursera/api.py on lines 1170..1202
    coursera/api.py on lines 1204..1236

    This issue has a mass of 140.

    Similar blocks of code found in 3 locations. Consider refactoring.
    Open

        def extract_links_from_peer_assignment(self, element_id):
            """
            Return a dictionary with links to supplement files (pdf, csv, zip,
            ipynb, html and so on) extracted from peer assignment.
    
    
    Severity: Major
    Found in coursera/api.py and 2 other locations - About 1 day to fix
    coursera/api.py on lines 1135..1168
    coursera/api.py on lines 1170..1202

    This issue has a mass of 140.

    Similar blocks of code found in 2 locations. Consider refactoring.
    Open

                    if not os.path.exists(self._course_name + "/notebook/" + head + "/" + tail):
                        logging.info(
                            'Downloading Jupyter %s into %s', tail, head)
                        with open(self._course_name + "/notebook/" + head + "/" + tail, 'wb+') as f:
                            f.write(r.content)
    Severity: Major
    Found in coursera/api.py and 1 other location - About 7 hrs to fix
    coursera/api.py on lines 692..697
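
One possible way to remove this duplication, along with the identical directory-creation blocks reported separately in this listing, is to move the path handling into a single helper. The sketch below is an assumption about how that could look, not the project's actual fix: `save_notebook_file` is a hypothetical name, the paths are built with `os.path.join` instead of string concatenation, the file is opened with `'wb'` rather than `'wb+'`, and the behaviour when the file already exists (returning `False`) is a guess because the original `else` branch is not shown.

```python
import logging
import os


def save_notebook_file(course_name, head, tail, content):
    """Write `content` to <course_name>/notebook/<head>/<tail> if missing.

    Hypothetical helper. Returns True when the file was written and
    False when it already existed.
    """
    directory = os.path.join(course_name, "notebook", head)
    path = os.path.join(directory, tail)

    # Create the target directory once, instead of in every caller.
    if not os.path.isdir(directory):
        logging.info('Creating [%s] directories...', head)
        os.makedirs(directory)

    # Skip the write when the file is already present.
    if os.path.exists(path):
        return False

    logging.info('Downloading %s into %s', tail, head)
    with open(path, 'wb') as f:
        f.write(content)
    return True
```

Each of the duplicated call sites could then reduce to one call such as `save_notebook_file(self._course_name, head, tail, r.content)`, with any difference in log wording passed in as a parameter if it matters.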

    This issue has a mass of 112.

    Similar blocks of code found in 2 locations. Consider refactoring.
    Open

                    if not os.path.exists(self._course_name + "/notebook/" + head + "/" + tail):
                        logging.info('Downloading %s into %s', tail, head)
                        with open(self._course_name + "/notebook/" + head + "/" + tail, 'wb+') as f:
                            f.write(r.content)
                    else:
    Severity: Major
    Found in coursera/api.py and 1 other location - About 7 hrs to fix
    coursera/api.py on lines 718..724

    This issue has a mass of 112.

    File commandline.py has 418 lines of code (exceeds 250 allowed). Consider refactoring.
    Open

    """
    This module contains code that is related to command-line argument
    handling. The primary candidate is argument parser.
    """
    
    
    Severity: Minor
    Found in coursera/commandline.py - About 6 hrs to fix

      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

          def extract_links_from_quiz(self, quiz_id):
              try:
                  session_id = self._get_quiz_session_id(quiz_id)
                  quiz_json = self._get_quiz_json(quiz_id, session_id)
                  return self._convert_quiz_json_to_links(quiz_json, 'quiz')
      Severity: Major
      Found in coursera/api.py and 1 other location - About 4 hrs to fix
      coursera/api.py on lines 640..650

      This issue has a mass of 84.

      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

          def extract_links_from_exam(self, exam_id):
              try:
                  session_id = self._get_exam_session_id(exam_id)
                  exam_json = self._get_exam_json(exam_id, session_id)
                  return self._convert_quiz_json_to_links(exam_json, 'exam')
      Severity: Major
      Found in coursera/api.py and 1 other location - About 4 hrs to fix
      coursera/api.py on lines 779..789
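
The excerpts for `extract_links_from_quiz` and `extract_links_from_exam` differ only in which two fetch methods they call and in the string passed to `_convert_quiz_json_to_links`. A minimal sketch of one way to share the body follows. The class name `AssessmentLinksSketch`, the helper `_extract_assessment_links`, and all stub method bodies are hypothetical; the `except` clause of the original methods is not shown in the report and is therefore left out here.

```python
class AssessmentLinksSketch(object):
    """Stand-in for the real class; the fetch and convert methods are stubs."""

    def _get_quiz_session_id(self, quiz_id):
        return 'quiz-session-' + quiz_id

    def _get_quiz_json(self, quiz_id, session_id):
        return {'id': quiz_id, 'session': session_id}

    def _get_exam_session_id(self, exam_id):
        return 'exam-session-' + exam_id

    def _get_exam_json(self, exam_id, session_id):
        return {'id': exam_id, 'session': session_id}

    def _convert_quiz_json_to_links(self, item_json, kind):
        # Stub: the real method's return value is not shown in the report.
        return {kind: item_json}

    def _extract_assessment_links(self, item_id, get_session_id, get_json, kind):
        """Shared body: fetch a session, fetch the JSON, convert to links."""
        session_id = get_session_id(item_id)
        item_json = get_json(item_id, session_id)
        return self._convert_quiz_json_to_links(item_json, kind)

    def extract_links_from_quiz(self, quiz_id):
        return self._extract_assessment_links(
            quiz_id, self._get_quiz_session_id, self._get_quiz_json, 'quiz')

    def extract_links_from_exam(self, exam_id):
        return self._extract_assessment_links(
            exam_id, self._get_exam_session_id, self._get_exam_json, 'exam')


api = AssessmentLinksSketch()
print(api.extract_links_from_quiz('q1'))
print(api.extract_links_from_exam('e1'))
```

Passing the bound methods as arguments keeps the public method names unchanged, so existing callers would not need to be touched.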

      This issue has a mass of 84.

      Function _extract_subtitles_from_video_dom has a Cognitive Complexity of 32 (exceeds 5 allowed). Consider refactoring.
      Open

          def _extract_subtitles_from_video_dom(self, video_dom,
                                                subtitle_language, video_id):
              # subtitles and transcripts
              subtitle_nodes = [
                  ('subtitles', 'srt', 'subtitle'),
      Severity: Minor
      Found in coursera/api.py - About 4 hrs to fix

      CourseraOnDemand has 36 functions (exceeds 20 allowed). Consider refactoring.
      Open

      class CourseraOnDemand(object):
          """
          This is a class that provides a friendly interface to extract certain
          parts of on-demand courses. On-demand class is a new format that Coursera
          is using, they contain `/learn/' in their URLs. This class does not support
      Severity: Minor
      Found in coursera/api.py - About 4 hrs to fix

        Function main has a Cognitive Complexity of 30 (exceeds 5 allowed). Consider refactoring.
        Open

        def main():
            """
            Main entry point for execution as a program (instead of as a module).
            """
        
        
        Severity: Minor
        Found in coursera/coursera_dl.py - About 4 hrs to fix

        Identical blocks of code found in 2 locations. Consider refactoring.
        Open

                        if not os.path.isdir(self._course_name + "/notebook/" + head + "/"):
                            logging.info('Creating [%s] directories...', head)
                            os.makedirs(self._course_name + "/notebook/" + head + "/")
        Severity: Major
        Found in coursera/api.py and 1 other location - About 4 hrs to fix
        coursera/api.py on lines 686..688

        This issue has a mass of 75.

        Identical blocks of code found in 2 locations. Consider refactoring.
        Open

                        if not os.path.isdir(self._course_name + "/notebook/" + head + "/"):
                            logging.info('Creating [%s] directories...', head)
                            os.makedirs(self._course_name + "/notebook/" + head + "/")
        Severity: Major
        Found in coursera/api.py and 1 other location - About 4 hrs to fix
        coursera/api.py on lines 712..714

        This issue has a mass of 75.

        Function _start_download has a Cognitive Complexity of 27 (exceeds 5 allowed). Consider refactoring.
        Open

            def _start_download(self, url, filename, resume=False):
                # resume has no meaning if the file doesn't exists!
                resume = resume and os.path.exists(filename)
        
                headers = {}
        Severity: Minor
        Found in coursera/downloaders.py - About 3 hrs to fix

        Function _get_notebook_folder has a Cognitive Complexity of 26 (exceeds 5 allowed). Consider refactoring.
        Open

            def _get_notebook_folder(self, url, jupyterId, **kwargs):
        
                supplement_links = {}
        
                url = url.format(**kwargs)
        Severity: Minor
        Found in coursera/api.py - About 3 hrs to fix

        Similar blocks of code found in 2 locations. Consider refactoring.
        Open

        @attr.s
        class ModuleV1(object):
            name = attr.ib()
            id = attr.ib()
            slug = attr.ib()
        Severity: Major
        Found in coursera/api.py and 1 other location - About 3 hrs to fix
        coursera/api.py on lines 450..458

        This issue has a mass of 70.
