daemonslayer/tests-airflow

Showing 51 of 51 total issues

Method gsob has 27 lines of code (exceeds 25 allowed). Consider refactoring.
Open

    public static TableSchema gsob() throws IOException {
        List<TableFieldSchema> fields = new ArrayList<>();
        fields.add(new TableFieldSchema().setName("partition_date").setType("DATE").setMode("NULLABLE"));
        fields.add(new TableFieldSchema().setName("temperature_mean").setType("FLOAT").setMode("NULLABLE"));
        fields.add(new TableFieldSchema().setName("temperature_min").setType("FLOAT").setMode("NULLABLE"));

    Function load_file has 8 arguments (exceeds 4 allowed). Consider refactoring.
    Open

        def load_file(
    Severity: Major
    Found in src/etl/examples/hive-example/dags/acme/hooks/hive_hooks.py - About 1 hr to fix
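One common way to bring an argument count like this under the limit is to group the optional settings into a single parameter object. The sketch below is only an illustration: the report does not show the real `load_file` signature, so `LoadOptions` and all of its fields are hypothetical names, and the function just returns a description of the load instead of touching Hive.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LoadOptions:
    """Hypothetical bundle of optional load settings (field names are assumptions)."""
    delimiter: str = ','
    create: bool = True
    overwrite: bool = True
    partition: Optional[Dict[str, str]] = None


def load_file(filepath, table, options=None):
    """Take two required arguments plus one options object instead of many keywords."""
    options = options or LoadOptions()
    # A real hook would run the load here; this sketch only reports what it would do.
    return {
        'filepath': filepath,
        'table': table,
        'delimiter': options.delimiter,
        'create': options.create,
        'overwrite': options.overwrite,
        'partition': options.partition or {},
    }


plan = load_file('/tmp/orders.csv', 'staging.orders',
                 LoadOptions(delimiter='\t', overwrite=False))
```

Callers that are happy with the defaults pass only the two required arguments, and new options can be added to the dataclass without changing the function signature.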

      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

          def execute(self, context):
              waiting_time = 3 + random.random() * 3
              time.sleep(waiting_time)
      Severity: Minor
      Found in src/plugins/operators/predict.py and 1 other location - About 55 mins to fix
      src/plugins/operators/book_data.py on lines 15..17

      Duplicated Code

      Duplicated code can lead to software that is hard to understand and difficult to change. The Don't Repeat Yourself (DRY) principle states:

      Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

      When you violate DRY, bugs and maintenance problems are sure to follow. Duplicated code has a tendency to both continue to replicate and also to diverge (leaving bugs as two similar implementations differ in subtle ways).
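For the two `execute` bodies flagged above, which differ only in the constant (3 in `predict.py`, 2 in `book_data.py`), one possible refactoring is a shared helper that takes the constant as a parameter. This is a minimal sketch, not the repository's code: `random_wait`, `RandomWaitMixin`, and `base_wait_seconds` are hypothetical names, and the injectable `rng` and `sleep` arguments exist only to make the helper testable.

```python
import random
import time


def random_wait(base_seconds, rng=random.random):
    """Return base_seconds plus a random extra of up to base_seconds."""
    return base_seconds + rng() * base_seconds


class RandomWaitMixin:
    """Hypothetical mixin; each operator overrides base_wait_seconds."""
    base_wait_seconds = 1.0

    def wait(self, sleep=time.sleep):
        waiting_time = random_wait(self.base_wait_seconds)
        sleep(waiting_time)
        return waiting_time


class PredictLike(RandomWaitMixin):
    base_wait_seconds = 3.0  # mirrors 3 + random.random() * 3


class BookDataLike(RandomWaitMixin):
    base_wait_seconds = 2.0  # mirrors 2 + random.random() * 2
```

Each operator's `execute` could then call `self.wait()` and keep only the logic that is actually specific to it.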

      Tuning

      This issue has a mass of 37.

      We set useful threshold defaults for the languages we support, but you may want to adjust these settings based on your project guidelines.

      The threshold configuration represents the minimum mass a code block must have to be analyzed for duplication. The lower the threshold, the more fine-grained the comparison.

      If the engine is too easily reporting duplication, try raising the threshold. If you suspect that the engine isn't catching enough duplication, try lowering the threshold. The best setting tends to differ from language to language.

      See codeclimate-duplication's documentation for more information about tuning the mass threshold in your .codeclimate.yml.


      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

      default_args = {
          'owner': 'robertjmoore',
          'depends_on_past': True,
          'start_date': datetime(2016, 7, 13),
          'email': ['robertj@robertjmoore.com'],
      Severity: Minor
      Found in src/singer/dags/singer.py and 1 other location - About 55 mins to fix
      src/tutorial/tutorial.py on lines 10..18

      This issue has a mass of 37.
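The `default_args` dictionaries in `singer.py` and `tutorial.py` share their keys and differ only in values, so one option is a small factory that both DAG files import. The sketch below covers only the four keys visible in the report; `make_default_args` is a hypothetical helper, and any further keys the real dictionaries contain would have to be passed through `overrides`.

```python
from datetime import datetime


def make_default_args(owner, start_date, email, depends_on_past=False, **overrides):
    """Build a default_args dict from the values that vary between DAGs."""
    args = {
        'owner': owner,
        'depends_on_past': depends_on_past,
        'start_date': start_date,
        'email': [email],
    }
    # Keys not shown in the report can be supplied by the caller.
    args.update(overrides)
    return args


singer_args = make_default_args('robertjmoore', datetime(2016, 7, 13),
                                'robertj@robertjmoore.com', depends_on_past=True)
tutorial_args = make_default_args('airflow', datetime(2015, 6, 1),
                                  'airflow@airflow.com')
```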

      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

          def execute(self, context):
              waiting_time = 2 + random.random() * 2
              time.sleep(waiting_time)
      Severity: Minor
      Found in src/plugins/operators/book_data.py and 1 other location - About 55 mins to fix
      src/plugins/operators/predict.py on lines 18..20

      This issue has a mass of 37.

      Similar blocks of code found in 2 locations. Consider refactoring.
      Open

      default_args = {
          'owner': 'airflow',
          'depends_on_past': False,
          'start_date': datetime(2015, 6, 1),
          'email': ['airflow@airflow.com'],
      Severity: Minor
      Found in src/tutorial/tutorial.py and 1 other location - About 55 mins to fix
      src/singer/dags/singer.py on lines 10..18

      This issue has a mass of 37.

      Function to_csv has 7 arguments (exceeds 4 allowed). Consider refactoring.
      Open

          def to_csv(
      Severity: Major
      Found in src/etl/examples/hive-example/dags/acme/hooks/hive_hooks.py - About 50 mins to fix

        Function __init__ has 6 arguments (exceeds 4 allowed). Consider refactoring.
        Open

            def __init__(
        Severity: Minor
        Found in src/etl/examples/etl-example/dags/acme/operators/dwh_operators.py - About 45 mins to fix

          Similar blocks of code found in 2 locations. Consider refactoring.
          Open

          dag = DAG(
              dag_id='latest_only',
              schedule_interval=dt.timedelta(hours=4),
              start_date=airflow.utils.dates.days_ago(2),
          Severity: Minor
          Found in src/examples/example_latest_only.py and 1 other location - About 45 mins to fix
          src/examples/example_latest_only_with_trigger.py on lines 25..28

          This issue has a mass of 35.
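The two `DAG(...)` calls flagged here differ only in `dag_id`, so the shared arguments could live in one helper. To keep the sketch runnable without Airflow installed, it builds a keyword dictionary rather than a DAG object, and it takes `start_date` as a parameter instead of calling `airflow.utils.dates.days_ago(2)`. `latest_only_dag_kwargs` is a hypothetical name; in a DAG file the result would presumably be passed as `DAG(**kwargs)`.

```python
import datetime as dt


def latest_only_dag_kwargs(dag_id, start_date):
    """Return the keyword arguments shared by both latest_only example DAGs."""
    return {
        'dag_id': dag_id,
        'schedule_interval': dt.timedelta(hours=4),
        'start_date': start_date,
    }


# Stand-in for days_ago(2): midnight two days before a fixed reference date.
start = dt.datetime(2018, 1, 3) - dt.timedelta(days=2)
plain = latest_only_dag_kwargs('latest_only', start)
triggered = latest_only_dag_kwargs('latest_only_with_trigger', start)
```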

          Similar blocks of code found in 2 locations. Consider refactoring.
          Open

          ds_false = [DummyOperator(task_id='false_' + str(i), dag=dag) for i in [1, 2]]
          Severity: Minor
          Found in src/examples/example_short_circuit_operator.py and 1 other location - About 45 mins to fix
          src/examples/example_short_circuit_operator.py on lines 34..34

          This issue has a mass of 35.
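The `ds_false` and `ds_true` comprehensions in `example_short_circuit_operator.py` differ only in the task-id prefix, so a helper parameterized on that prefix would remove the duplication. In this sketch `make_branch_tasks` is a hypothetical name and the task constructor is passed in as `make_task`, so the code runs without Airflow; in the DAG file one would presumably pass something like `functools.partial(DummyOperator, dag=dag)`.

```python
def make_branch_tasks(prefix, make_task, indices=(1, 2)):
    """Create one task per index with ids such as 'true_1' and 'true_2'."""
    return [make_task(task_id=prefix + '_' + str(i)) for i in indices]


def fake_task(task_id):
    """Stand-in for an operator constructor; returns the id it was given."""
    return task_id


ds_true = make_branch_tasks('true', fake_task)
ds_false = make_branch_tasks('false', fake_task)
```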

          Similar blocks of code found in 2 locations. Consider refactoring.
          Open

          dag = DAG(
              dag_id='latest_only_with_trigger',
              schedule_interval=dt.timedelta(hours=4),
              start_date=airflow.utils.dates.days_ago(2),
          Severity: Minor
          Found in src/examples/example_latest_only_with_trigger.py and 1 other location - About 45 mins to fix
          src/examples/example_latest_only.py on lines 25..28

          This issue has a mass of 35.

          Similar blocks of code found in 2 locations. Consider refactoring.
          Open

          ds_true = [DummyOperator(task_id='true_' + str(i), dag=dag) for i in [1, 2]]
          Severity: Minor
          Found in src/examples/example_short_circuit_operator.py and 1 other location - About 45 mins to fix
          src/examples/example_short_circuit_operator.py on lines 35..35

          This issue has a mass of 35.

          Avoid deeply nested control flow statements.
          Open

                                  if not rows:
                                      break
          
          
          Severity: Major
          Found in src/etl/examples/hive-example/dags/acme/hooks/hive_hooks.py - About 45 mins to fix
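The report shows only the innermost `if not rows: break`, so the surrounding loop is not known. Assuming it is a fetch-until-empty loop, one way to remove a nesting level is to move the loop and its exit test into a generator, leaving the caller with a flat `for`. `iter_batches` and `fetch_batch` are hypothetical names.

```python
def iter_batches(fetch_batch):
    """Yield batches from fetch_batch() until it returns an empty result."""
    while True:
        rows = fetch_batch()
        if not rows:
            return
        yield rows


# Stand-in for a cursor: hands out two batches, then an empty list.
pending = [[1, 2], [3], []]


def fake_fetch():
    return pending.pop(0)


collected = []
for batch in iter_batches(fake_fetch):
    collected.extend(batch)
```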

            Function __init__ has 5 arguments (exceeds 4 allowed). Consider refactoring.
            Open

                def __init__(self,
            Severity: Minor
            Found in src/etl/examples/file-ingest/acme/operators/file_operators.py - About 35 mins to fix

              Function __init__ has 5 arguments (exceeds 4 allowed). Consider refactoring.
              Open

                  def __init__(
              Severity: Minor
              Found in src/etl/examples/etl-example/dags/acme/operators/dwh_operators.py - About 35 mins to fix

                Function __init__ has 5 arguments (exceeds 4 allowed). Consider refactoring.
                Open

                    def __init__(self,
                Severity: Minor
                Found in src/etl/examples/file-ingest/acme/operators/file_operators.py - About 35 mins to fix

                  Function __init__ has 5 arguments (exceeds 4 allowed). Consider refactoring.
                  Open

                      def __init__(
                  Severity: Minor
                  Found in src/etl/examples/hive-example/dags/acme/hooks/hive_hooks.py - About 35 mins to fix

                    Function transfer_data_file has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring.
                    Open

                        def transfer_data_file(self, filepath, verbose=True):
                            filename = os.path.basename(filepath)
                            destination_path = '/user/cloudera/{0}'.format(filename)

                            hdfs_cmd = ['hdfs', 'dfs', '-put', filepath, destination_path]
                    Severity: Minor
                    Found in src/etl/examples/hive-example/dags/acme/hooks/hive_hooks.py - About 35 mins to fix

                    Cognitive Complexity

                    Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

                    A method's cognitive complexity is based on a few simple rules:

                    • Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
                    • Code is considered more complex for each "break in the linear flow of the code"
                    • Code is considered more complex when "flow breaking structures are nested"
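The rules above can be illustrated with two functions that return the same results. The first nests its branches, so each inner `if` is a flow break inside another flow break; the second uses early returns and a conditional expression to stay flat. Both functions are hypothetical and unrelated to `transfer_data_file`, whose full body the report does not show, and no specific complexity scores are claimed for them.

```python
def describe_nested(rows, verbose):
    """Nested version: branches sit inside other branches."""
    result = 'skipped'
    if rows is not None:
        if len(rows) > 0:
            if verbose:
                result = 'loaded {0} rows (verbose)'.format(len(rows))
            else:
                result = 'loaded {0} rows'.format(len(rows))
        else:
            result = 'empty'
    return result


def describe_flat(rows, verbose):
    """Flat version: guard clauses handle the special cases first."""
    if rows is None:
        return 'skipped'
    if not rows:
        return 'empty'
    suffix = ' (verbose)' if verbose else ''
    return 'loaded {0} rows{1}'.format(len(rows), suffix)
```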

                    Similar blocks of code found in 4 locations. Consider refactoring.
                    Open

                    yesterday = datetime.combine(datetime.today() - timedelta(1),
                                                 datetime.min.time())
                    Severity: Major
                    Found in src/gcp/dags/dataproc.py and 3 other locations - About 30 mins to fix
                    src/etl/examples/file-ingest/file_ingest.py on lines 22..24
                    src/gcp/dags/bigquery.py on lines 10..11
                    src/gcp/dags/cloud_storage.py on lines 7..8

                    This issue has a mass of 32.
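Since the same `yesterday` expression appears in four DAG files, it could move into one shared function. In the sketch below, `yesterday_midnight` is a hypothetical helper name and the optional `now` argument is an addition for testing; with no argument it computes the same value as the flagged expression.

```python
from datetime import datetime, timedelta


def yesterday_midnight(now=None):
    """Return midnight at the start of the day before `now` (default: today)."""
    if now is None:
        now = datetime.today()
    # combine() keeps only the date part of its first argument.
    return datetime.combine(now - timedelta(1), datetime.min.time())


yesterday = yesterday_midnight()
```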

                    Similar blocks of code found in 3 locations. Consider refactoring.
                    Open

                    process_order_fact = PostgresOperatorWithTemplatedParams(
                        task_id='process_order_fact',
                        postgres_conn_id='postgres_dwh',
                        sql='process_order_fact.sql',
                        parameters={"window_start_date": "{{ ds }}", "window_end_date": "{{ tomorrow_ds }}"},
                    Severity: Minor
                    Found in src/etl/examples/etl-example/dags/process_order_fact.py and 2 other locations - About 30 mins to fix
                    src/etl/examples/etl-example/dags/process_dimensions.py on lines 54..58
                    src/etl/examples/etl-example/dags/process_dimensions.py on lines 62..66

                    This issue has a mass of 32.
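The three `PostgresOperatorWithTemplatedParams` blocks could share one helper that builds their common arguments. Only the `process_order_fact` block is shown in the report, so this sketch assumes the other two follow the same pattern, including the convention that the SQL file is named after the task; `templated_sql_task_kwargs` is a hypothetical name, and the result would presumably be passed as `PostgresOperatorWithTemplatedParams(dag=dag, **kwargs)`.

```python
def templated_sql_task_kwargs(task_id, conn_id='postgres_dwh'):
    """Build the shared keyword arguments for a windowed SQL task."""
    return {
        'task_id': task_id,
        'postgres_conn_id': conn_id,
        # Assumption: the SQL file carries the same name as the task.
        'sql': task_id + '.sql',
        'parameters': {
            'window_start_date': '{{ ds }}',
            'window_end_date': '{{ tomorrow_ds }}',
        },
    }


kwargs = templated_sql_task_kwargs('process_order_fact')
```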
