muneebalam/scrapenhl2

View on GitHub
scrapenhl2/scrape/scrape_toi.py

Summary

Maintainability
D
2 days
Test Coverage

Function scrape_season_toi has a Cognitive Complexity of 11 (exceeds 5 allowed). Consider refactoring.
Open

def scrape_season_toi(season, force_overwrite=False):
    """
    Scrapes and parses toi from the given season.

    :param season: int, the season
Severity: Minor
Found in scrapenhl2/scrape/scrape_toi.py - About 1 hr to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

  • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
  • Code is considered more complex for each "break in the linear flow of the code"
  • Code is considered more complex when "flow breaking structures are nested"

Further reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

def save_raw_toi(page, season, game):
    """
    Takes the bytes page containing shift information and saves to disk as a compressed zlib.

    :param page: bytes. str(page) would yield a string version of the json shifts
Severity: Major
Found in scrapenhl2/scrape/scrape_toi.py and 1 other location - About 5 hrs to fix
scrapenhl2/scrape/scrape_pbp.py on lines 87..105

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 92.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 4 locations. Consider refactoring.
Open

        if interval_j < len(intervals):
            if i == intervals[interval_j][0]:
                print('Done scraping through {0:d} {1:d} ({2:d}%)'.format(
                    season, game, round(intervals[interval_j][0] / len(games) * 100)))
                interval_j += 1
Severity: Major
Found in scrapenhl2/scrape/scrape_toi.py and 3 other locations - About 5 hrs to fix
scrapenhl2/scrape/parse_pbp.py on lines 36..40
scrapenhl2/scrape/parse_toi.py on lines 48..52
scrapenhl2/scrape/scrape_pbp.py on lines 221..225

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 86.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 2 locations. Consider refactoring.
Open

def get_raw_toi(season, game):
    """
    Loads the compressed json file containing this game's shifts from disk.

    :param season: int, the season
Severity: Major
Found in scrapenhl2/scrape/scrape_toi.py and 1 other location - About 2 hrs to fix
scrapenhl2/scrape/scrape_pbp.py on lines 108..119

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 60.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 4 locations. Consider refactoring.
Open

def scrape_toi_setup():
    """
    Creates raw toi folders if need be

    :return:
Severity: Major
Found in scrapenhl2/scrape/scrape_toi.py and 3 other locations - About 1 hr to fix
scrapenhl2/scrape/parse_pbp.py on lines 355..362
scrapenhl2/scrape/parse_toi.py on lines 558..565
scrapenhl2/scrape/scrape_pbp.py on lines 228..235

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 43.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Identical blocks of code found in 2 locations. Consider refactoring.
Open

    if homeroad == 'H':
        filename = get_home_shiftlog_filename(season, game)
    elif homeroad == 'R':
        filename = get_road_shiftlog_filename(season, game)
Severity: Major
Found in scrapenhl2/scrape/scrape_toi.py and 1 other location - About 1 hr to fix
scrapenhl2/scrape/scrape_toi.py on lines 138..141

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 41.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Identical blocks of code found in 2 locations. Consider refactoring.
Open

    if homeroad == 'H':
        filename = get_home_shiftlog_filename(season, game)
    elif homeroad == 'R':
        filename = get_road_shiftlog_filename(season, game)
Severity: Major
Found in scrapenhl2/scrape/scrape_toi.py and 1 other location - About 1 hr to fix
scrapenhl2/scrape/scrape_toi.py on lines 117..120

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 41.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 7 locations. Consider refactoring.
Open

def get_game_raw_toi_filename(season, game):
    """
    Returns the filename of the raw toi folder

    :param season: int, current season
Severity: Major
Found in scrapenhl2/scrape/scrape_toi.py and 6 other locations - About 1 hr to fix
scrapenhl2/scrape/parse_pbp.py on lines 249..258
scrapenhl2/scrape/parse_toi.py on lines 524..533
scrapenhl2/scrape/scrape_pbp.py on lines 173..182
scrapenhl2/scrape/scrape_pbp.py on lines 185..194
scrapenhl2/scrape/scrape_toi.py on lines 38..47
scrapenhl2/scrape/scrape_toi.py on lines 50..58

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 38.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 7 locations. Consider refactoring.
Open

def get_home_shiftlog_filename(season, game):
    """
    Returns the filename of the parsed toi html home shifts

    :param season: int, the season
Severity: Major
Found in scrapenhl2/scrape/scrape_toi.py and 6 other locations - About 1 hr to fix
scrapenhl2/scrape/parse_pbp.py on lines 249..258
scrapenhl2/scrape/parse_toi.py on lines 524..533
scrapenhl2/scrape/scrape_pbp.py on lines 173..182
scrapenhl2/scrape/scrape_pbp.py on lines 185..194
scrapenhl2/scrape/scrape_toi.py on lines 50..58
scrapenhl2/scrape/scrape_toi.py on lines 197..206

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 38.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

Similar blocks of code found in 7 locations. Consider refactoring.
Open

def get_road_shiftlog_filename(season, game):
    """
    Returns the filename of the parsed toi html road shifts

    :param season: int, current season
Severity: Major
Found in scrapenhl2/scrape/scrape_toi.py and 6 other locations - About 1 hr to fix
scrapenhl2/scrape/parse_pbp.py on lines 249..258
scrapenhl2/scrape/parse_toi.py on lines 524..533
scrapenhl2/scrape/scrape_pbp.py on lines 173..182
scrapenhl2/scrape/scrape_pbp.py on lines 185..194
scrapenhl2/scrape/scrape_toi.py on lines 38..47
scrapenhl2/scrape/scrape_toi.py on lines 197..206

Duplicated Code

Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).

Tuning

This issue has a mass of 38.

We set useful threshold defaults for the languages we support but you may want to adjust these settings based on your project guidelines.

The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.

Refactorings

Further Reading

There are no issues that match your filters.

Category
Status