muneebalam/scrapenhl2

View on GitHub
scrapenhl2/scrape/scrape_pbp.py

Summary

Maintainability
F
3 days
Test Coverage

Function scrape_season_pbp has a Cognitive Complexity of 9 (exceeds 5 allowed). Consider refactoring.
Open

def scrape_season_pbp(season, force_overwrite=False):
    """
    Scrapes and parses pbp from the given season.

    :param season: int, the season
Severity: Minor
Found in scrapenhl2/scrape/scrape_pbp.py - About 55 mins to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"

Further reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

def save_raw_pbp(page, season, game):
    """
    Takes the bytes page containing pbp information and saves to disk as a compressed zlib.

    :param page: bytes. str(page) would yield a string version of the json pbp
Severity: Major
Found in scrapenhl2/scrape/scrape_pbp.py and 1 other location - About 5 hrs to fix
scrapenhl2/scrape/scrape_toi.py on lines 85..103

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 92.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 4 locations. Consider refactoring.
Open

        if interval_j < len(intervals):
            if i == intervals[interval_j][0]:
                print('Done scraping through {0:d} {1:d} ({2:d}%)'.format(
                    season, game, round(intervals[interval_j][0] / len(games) * 100)))
                interval_j += 1
Severity: Major
Found in scrapenhl2/scrape/scrape_pbp.py and 3 other locations - About 5 hrs to fix
scrapenhl2/scrape/parse_pbp.py on lines 36..40
scrapenhl2/scrape/parse_toi.py on lines 48..52
scrapenhl2/scrape/scrape_toi.py on lines 236..240

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 86.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

def scrape_game_pbp_from_html(season, game, force_overwrite=True):
    """
    This method scrapes the html pbp for the given game. Use for live games.

    :param season: int, the season
Severity: Major
Found in scrapenhl2/scrape/scrape_pbp.py and 1 other location - About 4 hrs to fix
scrapenhl2/scrape/scrape_pbp.py on lines 39..68

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 76.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

def scrape_game_pbp(season, game, force_overwrite=False):
    """
    This method scrapes the pbp for the given game.

    :param season: int, the season
Severity: Major
Found in scrapenhl2/scrape/scrape_pbp.py and 1 other location - About 4 hrs to fix
scrapenhl2/scrape/scrape_pbp.py on lines 14..36

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 76.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

def get_raw_pbp(season, game):
    """
    Loads the compressed json file containing this game's play by play from disk.

    :param season: int, the season
Severity: Major
Found in scrapenhl2/scrape/scrape_pbp.py and 1 other location - About 2 hrs to fix
scrapenhl2/scrape/scrape_toi.py on lines 147..158

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 60.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 4 locations. Consider refactoring.
Open

def scrape_pbp_setup():
    """
    Creates raw pbp folders if need be

    :return:
Severity: Major
Found in scrapenhl2/scrape/scrape_pbp.py and 3 other locations - About 1 hr to fix
scrapenhl2/scrape/parse_pbp.py on lines 355..362
scrapenhl2/scrape/parse_toi.py on lines 558..565
scrapenhl2/scrape/scrape_toi.py on lines 243..250

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 43.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 7 locations. Consider refactoring.
Open

def get_game_pbplog_filename(season, game):
    """
    Returns the filename of the parsed pbp html game pbp

    :param season: int, current season
Severity: Major
Found in scrapenhl2/scrape/scrape_pbp.py and 6 other locations - About 1 hr to fix
scrapenhl2/scrape/parse_pbp.py on lines 249..258
scrapenhl2/scrape/parse_toi.py on lines 524..533
scrapenhl2/scrape/scrape_pbp.py on lines 173..182
scrapenhl2/scrape/scrape_toi.py on lines 38..47
scrapenhl2/scrape/scrape_toi.py on lines 50..58
scrapenhl2/scrape/scrape_toi.py on lines 197..206

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 38.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 7 locations. Consider refactoring.
Open

def get_game_raw_pbp_filename(season, game):
    """
    Returns the filename of the raw pbp folder

    :param season: int, current season
Severity: Major
Found in scrapenhl2/scrape/scrape_pbp.py and 6 other locations - About 1 hr to fix
scrapenhl2/scrape/parse_pbp.py on lines 249..258
scrapenhl2/scrape/parse_toi.py on lines 524..533
scrapenhl2/scrape/scrape_pbp.py on lines 185..194
scrapenhl2/scrape/scrape_toi.py on lines 38..47
scrapenhl2/scrape/scrape_toi.py on lines 50..58
scrapenhl2/scrape/scrape_toi.py on lines 197..206

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 38.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

There are no issues that match your filters.

Category
Status