File RobotsTxtParser.php has 353 lines of code (exceeds 250 allowed). Consider refactoring. Open
<?php declare(strict_types=1);
namespace t1gor\RobotsTxtParser;
use Psr\Log\LoggerAwareInterface;
The class RobotsTxtParser has an overall complexity of 103 which is very high. The configured complexity threshold is 50. Open
class RobotsTxtParser implements LoggerAwareInterface {
use LogsIfAvailableTrait;
// default encoding
RobotsTxtParser has 27 functions (exceeds 20 allowed). Consider refactoring. Open
class RobotsTxtParser implements LoggerAwareInterface {
use LogsIfAvailableTrait;
// default encoding
Function parseURL has a Cognitive Complexity of 15 (exceeds 5 allowed). Consider refactoring. Open
protected function parseURL($url) {
$parsed = parse_url($url);
if ($parsed === false) {
return false;
} elseif (!isset($parsed['scheme']) || !$this->isValidScheme($parsed['scheme'])) {
Cognitive Complexity
Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.
A method's cognitive complexity is based on a few simple rules:
- Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
- Code is considered more complex for each "break in the linear flow of the code"
- Code is considered more complex when "flow breaking structures are nested"
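As a rough illustration of the nesting rule, the two functions below return the same result for the array-shaped input assumed here. Both function names and the shape of `$tree` are hypothetical and are not taken from RobotsTxtParser.php. The second version replaces the nested branches with guard clauses, which is the kind of change that usually lowers a cognitive-complexity score.

```php
<?php
declare(strict_types=1);

/**
 * Nested version: every extra level of nesting adds to the score.
 */
function firstSitemapNested(array $tree, ?string $userAgent): ?string
{
    if ($userAgent !== null) {
        if (isset($tree[$userAgent])) {
            if (!empty($tree[$userAgent]['sitemap'])) {
                return $tree[$userAgent]['sitemap'][0];
            }
        }
    }
    return null;
}

/**
 * Flat version: same behaviour, written as a linear run of guard clauses.
 */
function firstSitemapFlat(array $tree, ?string $userAgent): ?string
{
    if ($userAgent === null) {
        return null;
    }
    // Missing user agent or missing key both collapse to an empty list.
    $sitemaps = $tree[$userAgent]['sitemap'] ?? [];
    if ($sitemaps === []) {
        return null;
    }
    return $sitemaps[0];
}
```

The flat version never goes more than one level deep, so a reader can follow it top to bottom without tracking which branch they are in.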
Function render has a Cognitive Complexity of 13 (exceeds 5 allowed). Consider refactoring. Open
public function render($eol = "\r\n") {
$input = $this->getRules();
krsort($input);
$output = [];
foreach ($input as $userAgent => $rules) {
Method render has 30 lines of code (exceeds 25 allowed). Consider refactoring. Open
public function render($eol = "\r\n") {
$input = $this->getRules();
krsort($input);
$output = [];
foreach ($input as $userAgent => $rules) {
Function getSitemaps has a Cognitive Complexity of 10 (exceeds 5 allowed). Consider refactoring. Open
public function getSitemaps(?string $userAgent = null): array {
$this->buildTree();
$maps = [];
if (!is_null($userAgent)) {
Function checkRules has a Cognitive Complexity of 10 (exceeds 5 allowed). Consider refactoring. Open
protected function checkRules(string $rule, string $path, string $userAgent = '*'): bool {
// check for disallowed http status code
if ($this->checkHttpStatusCodeRule()) {
return ($rule === Directive::DISALLOW);
}
Method __construct has 5 arguments (exceeds 4 allowed). Consider refactoring. Open
$content,
string $encoding = self::DEFAULT_ENCODING,
?TreeBuilderInterface $treeBuilder = null,
?ReaderInterface $reader = null,
?UserAgentMatcherInterface $userAgentMatcher = null
Function getHost has a Cognitive Complexity of 7 (exceeds 5 allowed). Consider refactoring. Open
public function getHost(?string $userAgent = null) {
$this->buildTree();
if (!is_null($userAgent)) {
$userAgent = $this->userAgentMatcher->getMatching($userAgent, array_keys($this->tree));
Avoid too many return statements within this method. Open
return $parsed;
The method parseURL() has a Cyclomatic Complexity of 10. The configured cyclomatic complexity threshold is 10. Open
protected function parseURL($url) {
$parsed = parse_url($url);
if ($parsed === false) {
return false;
} elseif (!isset($parsed['scheme']) || !$this->isValidScheme($parsed['scheme'])) {
CyclomaticComplexity
Since: 0.1
Complexity is determined by the number of decision points in a method plus one for the method entry. The decision points are 'if', 'while', 'for', and 'case labels'. Generally, 1-4 is low complexity, 5-7 indicates moderate complexity, 8-10 is high complexity, and 11+ is very high complexity.
Example
// Cyclomatic Complexity = 11
class Foo {
1   public function example() {
2       if ($a == $b) {
3           if ($a1 == $b1) {
                fiddle();
4           } elseif ($a2 == $b2) {
                fiddle();
            } else {
                fiddle();
            }
5       } elseif ($c == $d) {
6           while ($c == $d) {
                fiddle();
            }
7       } elseif ($e == $f) {
8           for ($n = 0; $n < $h; $n++) {
                fiddle();
            }
        } else {
            switch ($z) {
9               case 1:
                    fiddle();
                    break;
10              case 2:
                    fiddle();
                    break;
11              case 3:
                    fiddle();
                    break;
                default:
                    fiddle();
                    break;
            }
        }
    }
}
Source https://phpmd.org/rules/codesize.html#cyclomaticcomplexity
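One common way to reduce the count is to replace a run of case labels with a lookup table. The sketch below uses two hypothetical functions that are not part of RobotsTxtParser.php, and the numbers in the comments follow only the counting rule quoted above. Real tools may count further operators, so treat the totals as approximate.

```php
<?php
declare(strict_types=1);

// Switch version: three case labels plus the method entry give a count of 4.
function defaultPortSwitch(string $scheme): ?int
{
    switch ($scheme) {
        case 'http':
            return 80;
        case 'https':
            return 443;
        case 'ftp':
            return 21;
        default:
            return null;
    }
}

// Lookup version: no 'if', 'while', 'for' or 'case', so only the method entry counts.
function defaultPortLookup(string $scheme): ?int
{
    $ports = ['http' => 80, 'https' => 443, 'ftp' => 21];

    return $ports[$scheme] ?? null;
}
```

Adding another scheme to the lookup version means adding an array entry, which leaves the decision-point count unchanged.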
The class RobotsTxtParser has a coupling between objects value of 15. Consider to reduce the number of dependencies under 13. Open
class RobotsTxtParser implements LoggerAwareInterface {
use LogsIfAvailableTrait;
// default encoding
CouplingBetweenObjects
Since: 1.1.0
A class with too many dependencies has negative impacts on several quality aspects of a class. This includes quality criteria like stability, maintainability and understandability.
Example
class Foo {
/**
* @var \foo\bar\X
*/
private $x = null;
/**
* @var \foo\bar\Y
*/
private $y = null;
/**
* @var \foo\bar\Z
*/
private $z = null;
public function setFoo(\Foo $foo) {}
public function setBar(\Bar $bar) {}
public function setBaz(\Baz $baz) {}
/**
* @return \SplObjectStorage
* @throws \OutOfRangeException
* @throws \InvalidArgumentException
* @throws \ErrorException
*/
public function process(\Iterator $it) {}
// ...
}
Source https://phpmd.org/rules/design.html#couplingbetweenobjects
Missing class import via use statement (line '236', column '13'). Open
throw new \RuntimeException(WarmingMessages::SET_UA_DEPRECATED);
MissingImport
Since: 2.7.0
Importing all external classes in a file through use statements makes them clearly visible.
Example
function make() {
return new \stdClass();
}
Source http://phpmd.org/rules/cleancode.html#MissingImport
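A minimal sketch of the fix this rule asks for: import the class once with a use statement and drop the leading backslash at the call site. The `Example` namespace and the `failLoudly` function are hypothetical names chosen for the illustration.

```php
<?php
declare(strict_types=1);

namespace Example;

use RuntimeException; // imported once, visible at the top of the file

function failLoudly(string $message): void
{
    // No leading backslash needed: the name resolves through the import.
    throw new RuntimeException($message);
}
```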
Avoid using static access to class '\t1gor\RobotsTxtParser\Parser\HostName' in method 'isValidHostName'. Open
return HostName::isValid($host);
StaticAccess
Since: 1.4.0
Static access causes unexchangeable dependencies to other classes and leads to hard to test code. Avoid using static access at all costs and instead inject dependencies through the constructor. The only case when static access is acceptable is when used for factory methods.
Example
class Foo
{
public function bar()
{
Bar::baz();
}
}
Source https://phpmd.org/rules/cleancode.html#staticaccess
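The constructor injection that the rule recommends could look roughly like the sketch below. `HostNameValidator`, `FilterHostNameValidator` and `HostChecker` are hypothetical names invented for this example, and the project may structure its validators differently. The default implementation relies on PHP's `filter_var()` with `FILTER_VALIDATE_DOMAIN` and `FILTER_FLAG_HOSTNAME`.

```php
<?php
declare(strict_types=1);

// Hypothetical interface; the real project may not define anything like it.
interface HostNameValidator
{
    public function isValid(string $host): bool;
}

// Hypothetical default implementation backed by PHP's filter extension.
final class FilterHostNameValidator implements HostNameValidator
{
    public function isValid(string $host): bool
    {
        return filter_var($host, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME) !== false;
    }
}

// The dependency arrives through the constructor instead of a static call.
final class HostChecker
{
    private HostNameValidator $validator;

    public function __construct(HostNameValidator $validator)
    {
        $this->validator = $validator;
    }

    public function accepts(string $host): bool
    {
        return $this->validator->isValid($host);
    }
}
```

Because `HostChecker` depends only on the interface, a test can pass in a stub validator instead of the real one.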
Avoid using static access to class 't1gor\RobotsTxtParser\Directive' in method 'checkRuleSwitch'. Open
switch (Directive::attemptGetInline($rule)) {
Avoid using static access to class 't1gor\RobotsTxtParser\Directive' in method 'checkRuleSwitch'. Open
if ($this->checkHostRule(Directive::stripInline($rule))) {
The method getSitemaps uses an else expression. Else clauses are basically not necessary and you can simplify the code by not using them. Open
} else {
foreach ($this->tree as $userAgentBased) {
if (isset($userAgentBased[Directive::SITEMAP]) && !empty($userAgentBased[Directive::SITEMAP])) {
$maps = array_merge($maps, $userAgentBased[Directive::SITEMAP]);
}
ElseExpression
Since: 1.4.0
An if expression with an else branch is basically not necessary. You can rewrite the conditions in a way that the else clause is not necessary and the code becomes simpler to read. To achieve this, use early return statements, though you may need to split the code into several smaller methods. For very simple assignments you could also use the ternary operator.
Example
class Foo
{
public function bar($flag)
{
if ($flag) {
// one branch
} else {
// another branch
}
}
}
Source https://phpmd.org/rules/cleancode.html#elseexpression
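Applied to a URL parser shaped like the parseURL fragments quoted in this report, the early-return style might look like the sketch below. It is a standalone function rather than the project's method. The scheme whitelist and the host check are simplified stand-ins assumed for the example, not the project's own validators.

```php
<?php
declare(strict_types=1);

/**
 * Guard-clause sketch of a URL parser with no else branches.
 *
 * @return array<string, mixed>|false
 */
function parseUrlFlat(string $url)
{
    $parsed = parse_url($url);
    if ($parsed === false) {
        return false;
    }
    // Assumed whitelist; the real scheme validation may differ.
    if (!isset($parsed['scheme']) || !in_array($parsed['scheme'], ['http', 'https'], true)) {
        return false;
    }
    // Assumed host check; the real host name validation may differ.
    if (!isset($parsed['host']) || $parsed['host'] === '') {
        return false;
    }
    if (!isset($parsed['port'])) {
        // getservbyname() returns false when the service is unknown.
        $parsed['port'] = getservbyname($parsed['scheme'], 'tcp');
    }
    if (!is_int($parsed['port'])) {
        return false;
    }
    $parsed['custom'] = ($parsed['path'] ?? '/')
        . (isset($parsed['query']) ? '?' . $parsed['query'] : '');

    return $parsed;
}
```

Each failure condition exits immediately, so the successful path reads straight down the function at a single indentation level.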
The method parseURL uses an else expression. Else clauses are basically not necessary and you can simplify the code by not using them. Open
} else {
if (!isset($parsed['host']) || !$this->isValidHostName($parsed['host'])) {
return false;
} else {
if (!isset($parsed['port'])) {
Avoid using static access to class 't1gor\RobotsTxtParser\Directive' in method 'checkRuleSwitch'. Open
if ($this->checkCleanParamRule(Directive::stripInline($rule), $path)) {
The method render uses an else expression. Else clauses are basically not necessary and you can simplify the code by not using them. Open
} else {
$output[] = $directive . ': ' . $value;
}
The method parseURL uses an else expression. Else clauses are basically not necessary and you can simplify the code by not using them. Open
} else {
if (!isset($parsed['port'])) {
$parsed['port'] = getservbyname($parsed['scheme'], 'tcp');
if (!is_int($parsed['port'])) {
return false;
Avoid using static access to class '\t1gor\RobotsTxtParser\Stream\GeneratorBasedReader' in method '__construct'. Open
? GeneratorBasedReader::fromStream($content)
Avoid using static access to class '\t1gor\RobotsTxtParser\Parser\DirectiveProcessorsFactory' in method 'buildTree'. Open
DirectiveProcessorsFactory::getDefault($this->logger),
Avoid using static access to class '\t1gor\RobotsTxtParser\Stream\GeneratorBasedReader' in method '__construct'. Open
: GeneratorBasedReader::fromString($content);
Avoid using static access to class '\t1gor\RobotsTxtParser\Parser\Url' in method 'isValidScheme'. Open
return Url::isValidScheme($scheme);
Avoid unused private fields such as '$content'. Open
private $content = '';
UnusedPrivateField
Since: 0.2
Detects when a private field is declared and/or assigned a value, but not used.
Example
class Something
{
private static $FOO = 2; // Unused
private $i = 5; // Unused
private $j = 6;
public function addOne()
{
return $this->j++;
}
}
Source https://phpmd.org/rules/unusedcode.html#unusedprivatefield
Avoid unused private fields such as '$userAgent'. Open
private $userAgent = '*';
Avoid unused parameters such as '$userAgent'. Open
public function setUserAgent(string $userAgent) {
UnusedFormalParameter
Since: 0.2
Avoid passing parameters to methods or constructors and then not using those parameters.
Example
class Foo
{
private function bar($howdy)
{
// $howdy is not used
}
}
Source https://phpmd.org/rules/unusedcode.html#unusedformalparameter
Avoid variables with short names like $b. Configured minimum length is 3. Open
usort($value, function ($a, $b) {
ShortVariable
Since: 0.2
Detects when a field, local, or parameter has a very short name.
Example
class Something {
private $q = 15; // VIOLATION - Field
public static function main( array $as ) { // VIOLATION - Formal
$r = 20 + $this->q; // VIOLATION - Local
for ($i = 0; $i < 10; $i++) { // Not a Violation (inside FOR)
$r += $this->q;
}
}
}
Source https://phpmd.org/rules/naming.html#shortvariable
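For a comparator callback like the flagged `usort()` call, longer parameter names are enough to satisfy the rule. The sketch below is illustrative only: the `$rules` data and the longest-first ordering are assumptions, since the body of the original callback is not shown in this report.

```php
<?php
declare(strict_types=1);

// Hypothetical rule list; sorting longest-first is an assumption for this sketch.
$rules = ['/a', '/admin/private', '/admin'];

usort($rules, function (string $left, string $right): int {
    // Spaceship operator: negative, zero or positive, as usort() expects.
    return strlen($right) <=> strlen($left);
});
```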
Avoid variables with short names like $a. Configured minimum length is 3. Open
usort($value, function ($a, $b) {
Scope keyword "private" must be followed by a single space Open
private $content = '';
Opening brace of a class must be on the line after the definition Open
class RobotsTxtParser implements LoggerAwareInterface {
Blank line found at start of control structure Open
switch (Directive::attemptGetInline($rule)) {
CASE statements must be defined using a colon Open
case Directive::HOST;
Only one argument is allowed per line in a multi-line function call Open
$host, [
Spaces must be used to indent lines; tabs are not allowed Open
private $url = null;
Spaces must be used to indent lines; tabs are not allowed Open
private $content = '';
Spaces must be used to indent lines; tabs are not allowed Open
private array $tree = [];
Spaces must be used to indent lines; tabs are not allowed Open
public function __construct(
Spaces must be used to indent lines; tabs are not allowed Open
$content,
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
if (is_null($this->treeBuilder)) {
Spaces must be used to indent lines; tabs are not allowed Open
);
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
protected $rules = [];
Spaces must be used to indent lines; tabs are not allowed Open
// UserAgent
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
if ($this->reader instanceof LoggerAwareInterface) {
Spaces must be used to indent lines; tabs are not allowed Open
$this->reader->setLogger($this->logger);
Spaces must be used to indent lines; tabs are not allowed Open
use LogsIfAvailableTrait;
Spaces must be used to indent lines; tabs are not allowed Open
$this->log('Reader is not passed, using a default one...');
Spaces must be used to indent lines; tabs are not allowed Open
: GeneratorBasedReader::fromString($content);
Spaces must be used to indent lines; tabs are not allowed Open
const DEFAULT_ENCODING = 'UTF-8';
Spaces must be used to indent lines; tabs are not allowed Open
// host set
Spaces must be used to indent lines; tabs are not allowed Open
$this->encoding = $encoding;
Spaces must be used to indent lines; tabs are not allowed Open
if (is_null($this->reader)) {
Spaces must be used to indent lines; tabs are not allowed Open
? GeneratorBasedReader::fromStream($content)
Spaces must be used to indent lines; tabs are not allowed Open
if (!empty($this->tree)) {
Spaces must be used to indent lines; tabs are not allowed Open
return $this->logger;
Spaces must be used to indent lines; tabs are not allowed Open
$this->userAgentMatcher->setLogger($this->logger);
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $url
Spaces must be used to indent lines; tabs are not allowed Open
?TreeBuilderInterface $treeBuilder = null,
Spaces must be used to indent lines; tabs are not allowed Open
$this->reader = is_resource($content)
Spaces must be used to indent lines; tabs are not allowed Open
$this->log('UserAgentMatcher is not passed, using a default one...');
Spaces must be used to indent lines; tabs are not allowed Open
if ($this->userAgentMatcher instanceof LoggerAwareInterface) {
Spaces must be used to indent lines; tabs are not allowed Open
protected function parseURL($url) {
Spaces must be used to indent lines; tabs are not allowed Open
$parsed = parse_url($url);
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $rule
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
* @return void
Spaces must be used to indent lines; tabs are not allowed Open
throw new \RuntimeException(WarmingMessages::SET_UA_DEPRECATED);
Spaces must be used to indent lines; tabs are not allowed Open
if ($this->checkHttpStatusCodeRule()) {
Spaces must be used to indent lines; tabs are not allowed Open
foreach ([Directive::DISALLOW, Directive::ALLOW] as $directive) {
Spaces must be used to indent lines; tabs are not allowed Open
if (!isset($this->tree[$userAgent][$directive])) {
Spaces must be used to indent lines; tabs are not allowed Open
* Check HTTP status code rule
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
return;
Spaces must be used to indent lines; tabs are not allowed Open
$this->reader->setEncoding($this->encoding);
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$param = explode('&', $array[0]);
Spaces must be used to indent lines; tabs are not allowed Open
return $cleanParam;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
private string $encoding = '';
Spaces must be used to indent lines; tabs are not allowed Open
?UserAgentMatcherInterface $userAgentMatcher = null
Spaces must be used to indent lines; tabs are not allowed Open
) {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
private static function isValidHostName(string $host): bool {
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
*/
Line exceeds 120 characters; contains 130 characters Open
$cleanParam['path'] = isset($array[1]) ? $this->encode_url(preg_replace('/[^A-Za-z0-9\.-\/\*\_]/', '', $array[1])) : '/*';
Spaces must be used to indent lines; tabs are not allowed Open
!is_null($this->logger) && $url->setLogger($this->logger);
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $userAgent
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
// rule match
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
protected ?int $httpStatusCode;
Spaces must be used to indent lines; tabs are not allowed Open
// url
Spaces must be used to indent lines; tabs are not allowed Open
$this->reader = $reader;
Spaces must be used to indent lines; tabs are not allowed Open
if (is_null($this->userAgentMatcher)) {
Spaces must be used to indent lines; tabs are not allowed Open
private function buildTree() {
Spaces must be used to indent lines; tabs are not allowed Open
$this->logger
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
// check for disallowed http status code
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
return $result;
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
private ?TreeBuilderInterface $treeBuilder;
Spaces must be used to indent lines; tabs are not allowed Open
$this->treeBuilder = $treeBuilder;
Spaces must be used to indent lines; tabs are not allowed Open
$this->userAgentMatcher = new UserAgentMatcher();
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$this->treeBuilder = new TreeBuilder(
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
* Validate URL scheme
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
// strip multi-spaces
Spaces must be used to indent lines; tabs are not allowed Open
$cleanParam['param'][] = trim($key);
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
if ($this->checkRuleSwitch($robotRule, $path)) {
Spaces must be used to indent lines; tabs are not allowed Open
$result = ($rule === $directive);
Spaces must be used to indent lines; tabs are not allowed Open
$this->tree = $this->treeBuilder->build();
Spaces must be used to indent lines; tabs are not allowed Open
$this->logger = $logger;
Spaces must be used to indent lines; tabs are not allowed Open
return Url::isValidScheme($scheme);
Spaces must be used to indent lines; tabs are not allowed Open
} elseif (!isset($parsed['scheme']) || !$this->isValidScheme($parsed['scheme'])) {
Spaces must be used to indent lines; tabs are not allowed Open
$array = explode(' ', $rule, 2);
Spaces must be used to indent lines; tabs are not allowed Open
* @param int $code
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $userAgent - which robot to check for
Spaces must be used to indent lines; tabs are not allowed Open
protected function checkRules(string $rule, string $path, string $userAgent = '*'): bool {
Spaces must be used to indent lines; tabs are not allowed Open
return true;
Spaces must be used to indent lines; tabs are not allowed Open
switch (Directive::attemptGetInline($rule)) {
Spaces must be used to indent lines; tabs are not allowed Open
break;
Spaces must be used to indent lines; tabs are not allowed Open
protected $host = null;
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
} else {
Spaces must be used to indent lines; tabs are not allowed Open
$parsed['port'] = getservbyname($parsed['scheme'], 'tcp');
Line exceeds 120 characters; contains 135 characters Open
$parsed['custom'] = (isset($parsed['path']) ? $parsed['path'] : '/') . (isset($parsed['query']) ? '?' . $parsed['query'] : '');
Spaces must be used to indent lines; tabs are not allowed Open
$cleanParam = [];
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
return $this->checkRules(Directive::ALLOW, $url->getPath(), $userAgent);
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $rule - rule to check
Spaces must be used to indent lines; tabs are not allowed Open
$userAgent = $this->userAgentMatcher->getMatching($userAgent, array_keys($this->tree));
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
// rules set
Spaces must be used to indent lines; tabs are not allowed Open
// robots.txt http status code
Spaces must be used to indent lines; tabs are not allowed Open
// robots.txt file content
Spaces must be used to indent lines; tabs are not allowed Open
?ReaderInterface $reader = null,
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $scheme
Spaces must be used to indent lines; tabs are not allowed Open
} else {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
foreach ($this->tree[$userAgent][$directive] as $robotRule) {
Spaces must be used to indent lines; tabs are not allowed Open
// check rule
Spaces must be used to indent lines; tabs are not allowed Open
if ($this->checkHostRule(Directive::stripInline($rule))) {
Spaces must be used to indent lines; tabs are not allowed Open
$this->log('Creating a default tree builder as none passed...');
Spaces must be used to indent lines; tabs are not allowed Open
DirectiveProcessorsFactory::getDefault($this->logger),
Spaces must be used to indent lines; tabs are not allowed Open
$this->treeBuilder->setContent($this->reader->getContentIterated());
Spaces must be used to indent lines; tabs are not allowed Open
public function getLogger(): ?LoggerInterface {
Spaces must be used to indent lines; tabs are not allowed Open
return HostName::isValid($host);
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
if (!isset($parsed['port'])) {
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
* @return bool
Spaces must be used to indent lines; tabs are not allowed Open
public function isAllowed(string $url, ?string $userAgent = '*'): bool {
Spaces must be used to indent lines; tabs are not allowed Open
* @deprecated please check rules for exact user agent instead
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $path - path to check
Spaces must be used to indent lines; tabs are not allowed Open
* @return bool
Spaces must be used to indent lines; tabs are not allowed Open
$result = ($rule === Directive::ALLOW);
Spaces must be used to indent lines; tabs are not allowed Open
$this->log("Disallowed by HTTP status code {$this->httpStatusCode}");
Spaces must be used to indent lines; tabs are not allowed Open
return true;
Spaces must be used to indent lines; tabs are not allowed Open
// default encoding
Spaces must be used to indent lines; tabs are not allowed Open
private ?UserAgentMatcherInterface $userAgentMatcher;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
if ($this->encoding !== static::DEFAULT_ENCODING) {
Spaces must be used to indent lines; tabs are not allowed Open
// construct a tree builder if not passed
Spaces must be used to indent lines; tabs are not allowed Open
public function setLogger(LoggerInterface $logger): void {
Spaces must be used to indent lines; tabs are not allowed Open
$rule = preg_replace('/\s+/S', ' ', $rule);
Spaces must be used to indent lines; tabs are not allowed Open
return true;
Spaces must be used to indent lines; tabs are not allowed Open
* Set UserAgent
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
return true;
Spaces must be used to indent lines; tabs are not allowed Open
// match result
Spaces must be used to indent lines; tabs are not allowed Open
* Check basic rule
Spaces must be used to indent lines; tabs are not allowed Open
if (mb_strrpos($value, '/') == (mb_strlen($value) - 1)
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$directive = in_array($type, [Directive::CACHE, Directive::CACHE_DELAY])
Spaces must be used to indent lines; tabs are not allowed Open
$this->log("{$directive} directive (unofficial): Not found, fallback to " . Directive::CRAWL_DELAY . " directive");
Spaces must be used to indent lines; tabs are not allowed Open
return $this->tree[$userAgent][Directive::CRAWL_DELAY];
Spaces must be used to indent lines; tabs are not allowed Open
* @deprecated
Spaces must be used to indent lines; tabs are not allowed Open
return $this->reader->getContentRaw();
Spaces must be used to indent lines; tabs are not allowed Open
private $userAgent = '*';
Spaces must be used to indent lines; tabs are not allowed Open
private ?ReaderInterface $reader;
Spaces must be used to indent lines; tabs are not allowed Open
string $encoding = self::DEFAULT_ENCODING,
Spaces must be used to indent lines; tabs are not allowed Open
$this->userAgentMatcher = $userAgentMatcher;
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
if ($parsed === false) {
Spaces must be used to indent lines; tabs are not allowed Open
$parsed['custom'] = (isset($parsed['path']) ? $parsed['path'] : '/') . (isset($parsed['query']) ? '?' . $parsed['query'] : '');
Spaces must be used to indent lines; tabs are not allowed Open
* @return array
Spaces must be used to indent lines; tabs are not allowed Open
// strip any invalid characters from path prefix
Spaces must be used to indent lines; tabs are not allowed Open
$cleanParam['path'] = isset($array[1]) ? $this->encode_url(preg_replace('/[^A-Za-z0-9\.-\/\*\_]/', '', $array[1])) : '/*';
Spaces must be used to indent lines; tabs are not allowed Open
public function setHttpStatusCode(int $code): bool {
Spaces must be used to indent lines; tabs are not allowed Open
if (!is_int($code) || $code < 100 || $code > 599) {
Spaces must be used to indent lines; tabs are not allowed Open
$this->buildTree();
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
* Check rules
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $path
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
return $value;
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
)) {
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$this->buildTree();
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
if (!isset($parsed['host']) || !$this->isValidHostName($parsed['host'])) {
Spaces must be used to indent lines; tabs are not allowed Open
private function explodeCleanParamRule($rule) {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
// Check each directive for rules, allowed by default
Spaces must be used to indent lines; tabs are not allowed Open
case Directive::HOST;
Spaces must be used to indent lines; tabs are not allowed Open
&& !strpos($path, "&$param=")
Spaces must be used to indent lines; tabs are not allowed Open
$escaped = strtr($this->prepareRegexRule($rule), ['@' => '\@']);
Spaces must be used to indent lines; tabs are not allowed Open
) {
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
$url['host'] . ':' . $url['port'],
Spaces must be used to indent lines; tabs are not allowed Open
$url['scheme'] . '://' . $url['host'],
Spaces must be used to indent lines; tabs are not allowed Open
return true;
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $url - url to check
Spaces must be used to indent lines; tabs are not allowed Open
!is_null($this->logger) && $url->setLogger($this->logger);
Spaces must be used to indent lines; tabs are not allowed Open
: Directive::CRAWL_DELAY;
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
$output[] = '';
Spaces must be used to indent lines; tabs are not allowed Open
$this->buildTree();
Spaces must be used to indent lines; tabs are not allowed Open
if ($userAgent === null) {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
return null;
Spaces must be used to indent lines; tabs are not allowed Open
$this->buildTree();
Spaces must be used to indent lines; tabs are not allowed Open
if (isset($this->tree[$userAgent][Directive::SITEMAP]) && !empty($this->tree[$userAgent][Directive::SITEMAP])) {
Spaces must be used to indent lines; tabs are not allowed Open
* @return bool
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
* Parse URL
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
* @return array|false
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
// split into parameter and path
Spaces must be used to indent lines; tabs are not allowed Open
$this->httpStatusCode = $code;
Spaces must be used to indent lines; tabs are not allowed Open
private function checkHttpStatusCodeRule(): bool {
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $rule
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
$this->buildTree();
Spaces must be used to indent lines; tabs are not allowed Open
// return delay for requested directive
Spaces must be used to indent lines; tabs are not allowed Open
if (isset($this->tree[$userAgent][Directive::CRAWL_DELAY])) {
Spaces must be used to indent lines; tabs are not allowed Open
return 0;
Spaces must be used to indent lines; tabs are not allowed Open
* @return array
Spaces must be used to indent lines; tabs are not allowed Open
foreach ($input as $userAgent => $rules) {
Spaces must be used to indent lines; tabs are not allowed Open
$directive = ucfirst($directive);
Spaces must be used to indent lines; tabs are not allowed Open
return mb_strlen($a) < mb_strlen($b);
Spaces must be used to indent lines; tabs are not allowed Open
$host = $this->getHost();
Spaces must be used to indent lines; tabs are not allowed Open
return [];
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
if (!is_null($userAgent)) {
Spaces must be used to indent lines; tabs are not allowed Open
if (!is_int($parsed['port'])) {
Spaces must be used to indent lines; tabs are not allowed Open
* Explode Clean-Param rule
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
$url = new Url($url);
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
continue;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
case Directive::CLEAN_PARAM:
Spaces must be used to indent lines; tabs are not allowed Open
default:
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
|| mb_strrpos($value, '=') == (mb_strlen($value) - 1)
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
]
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
if (!isset($this->tree[Directive::CLEAN_PARAM]) || empty($this->tree[Directive::CLEAN_PARAM])) {
Spaces must be used to indent lines; tabs are not allowed Open
$this->log(Directive::CLEAN_PARAM . ' directive: Not found');
Spaces must be used to indent lines; tabs are not allowed Open
usort($value, function ($a, $b) {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
foreach ($this->tree as $userAgentBased) {
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
|| mb_strrpos($value, '?') == (mb_strlen($value) - 1)
Spaces must be used to indent lines; tabs are not allowed Open
if (substr($value, 0, 2) != '.*') {
Spaces must be used to indent lines; tabs are not allowed Open
$host, [
Spaces must be used to indent lines; tabs are not allowed Open
public function isDisallowed(string $url, string $userAgent = '*'): bool {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
} else {
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
$userAgent = $this->userAgentMatcher->getMatching($userAgent, array_keys($this->tree));
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
} else {
Spaces must be used to indent lines; tabs are not allowed Open
* Set the HTTP status code
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
return ($rule === Directive::DISALLOW);
Spaces must be used to indent lines; tabs are not allowed Open
* @return bool
Spaces must be used to indent lines; tabs are not allowed Open
if (isset($this->httpStatusCode) && $this->httpStatusCode >= 500 && $this->httpStatusCode <= 599) {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $rule
Spaces must be used to indent lines; tabs are not allowed Open
return true;
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
$host = trim(str_ireplace(Directive::HOST . ':', '', mb_strtolower($rule)));
Spaces must be used to indent lines; tabs are not allowed Open
public function getCleanParam(): array {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
* @return string
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
if (isset($this->tree[$userAgent][Directive::HOST]) && !empty($this->tree[$userAgent][Directive::HOST])) {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
if (mb_strlen($value) > 2 && mb_substr($value, -2) == '\$') {
Spaces must be used to indent lines; tabs are not allowed Open
private function checkHostRule($rule) {
Spaces must be used to indent lines; tabs are not allowed Open
$this->log($error_msg, [], LogLevel::ERROR);
Spaces must be used to indent lines; tabs are not allowed Open
$url['host'],
Spaces must be used to indent lines; tabs are not allowed Open
$url['scheme'] . '://' . $url['host'] . ':' . $url['port'],
Spaces must be used to indent lines; tabs are not allowed Open
$this->log('Rule match: ' . Directive::HOST . ' directive');
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
return [];
Spaces must be used to indent lines; tabs are not allowed Open
krsort($input);
Spaces must be used to indent lines; tabs are not allowed Open
if ($host !== null) {
Spaces must be used to indent lines; tabs are not allowed Open
public function getRules(?string $userAgent = null) {
Spaces must be used to indent lines; tabs are not allowed Open
$maps = array_merge($maps, $userAgentBased[Directive::SITEMAP]);
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
if (preg_match('@' . $escaped . '@', $path)) {
Spaces must be used to indent lines; tabs are not allowed Open
return true;
Spaces must be used to indent lines; tabs are not allowed Open
return false;
Spaces must be used to indent lines; tabs are not allowed Open
$error_msg = WarmingMessages::INLINED_HOST;
Spaces must be used to indent lines; tabs are not allowed Open
if (in_array(
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
return $this->tree[$userAgent][$directive];
Line exceeds 120 characters; contains 127 characters Open
$this->log("{$directive} directive (unofficial): Not found, fallback to " . Directive::CRAWL_DELAY . " directive");
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
foreach ($sitemaps as $sitemap) {
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
* @return string[]|string|null
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$hosts = [];
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
foreach ($param as $key) {
Spaces must be used to indent lines; tabs are not allowed Open
public function setUserAgent(string $userAgent) {
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
* Check Clean-Param rule
Spaces must be used to indent lines; tabs are not allowed Open
$this->log('Rule match: Path');
Spaces must be used to indent lines; tabs are not allowed Open
protected function prepareRegexRule(string $value): string {
Spaces must be used to indent lines; tabs are not allowed Open
$value = substr($value, 0, -2) . '$';
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
$url = $this->parseURL($this->url);
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
return $this->tree[Directive::CLEAN_PARAM];
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
$output = [];
Spaces must be used to indent lines; tabs are not allowed Open
if (is_array($value)) {
Spaces must be used to indent lines; tabs are not allowed Open
$userAgent = $this->userAgentMatcher->getMatching($userAgent, array_keys($this->tree));
Spaces must be used to indent lines; tabs are not allowed Open
if (isset($this->tree[$userAgent])) {
Spaces must be used to indent lines; tabs are not allowed Open
array_push($hosts, $userAgentBased[Directive::HOST]);
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
if (!$this->checkBasicRule($cleanParam['path'], $path)) {
Spaces must be used to indent lines; tabs are not allowed Open
) {
Spaces must be used to indent lines; tabs are not allowed Open
if (!isset($this->url)) {
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
* @deprecated
Spaces must be used to indent lines; tabs are not allowed Open
$output[] = $directive . ': ' . $subValue;
Spaces must be used to indent lines; tabs are not allowed Open
$output[] = 'Sitemap: ' . $sitemap;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$this->log(sprintf("No direct match found for '%s', fallback to *", $userAgent));
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
return !empty($hosts) ? $hosts : null;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
* @return bool
Spaces must be used to indent lines; tabs are not allowed Open
public function getDelay(string $userAgent = "*", string $type = Directive::CRAWL_DELAY) {
Spaces must be used to indent lines; tabs are not allowed Open
if (isset($this->tree[$userAgent][$directive])) {
Spaces must be used to indent lines; tabs are not allowed Open
$this->buildTree();
Spaces must be used to indent lines; tabs are not allowed Open
public function getContent(): string {
Spaces must be used to indent lines; tabs are not allowed Open
* Render
Spaces must be used to indent lines; tabs are not allowed Open
public function render($eol = "\r\n") {
Spaces must be used to indent lines; tabs are not allowed Open
$input = $this->getRules();
Spaces must be used to indent lines; tabs are not allowed Open
$output[] = 'User-agent: ' . $userAgent;
Spaces must be used to indent lines; tabs are not allowed Open
return implode($eol, $output);
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$userAgent = $this->userAgentMatcher->getMatching($userAgent, array_keys($this->tree));
Spaces must be used to indent lines; tabs are not allowed Open
return $this->checkBasicRule($rule, $path);
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
// Shorter paths later
Spaces must be used to indent lines; tabs are not allowed Open
$cleanParam = $this->explodeCleanParamRule($rule);
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
});
Spaces must be used to indent lines; tabs are not allowed Open
private function checkBasicRule(string $rule, string $path): bool {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
return $this->tree;
Spaces must be used to indent lines; tabs are not allowed Open
* @return bool
Spaces must be used to indent lines; tabs are not allowed Open
// fallback for *
Spaces must be used to indent lines; tabs are not allowed Open
if (isset($this->tree['*'])) {
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
return $this->tree['*'];
Spaces must be used to indent lines; tabs are not allowed Open
* @param string $eol
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
public function getHost(?string $userAgent = null) {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
foreach ($this->tree as $userAgentBased) {
Spaces must be used to indent lines; tabs are not allowed Open
$this->log(sprintf("Rules not found for the given User-Agent '%s'", $userAgent));
Spaces must be used to indent lines; tabs are not allowed Open
if (isset($userAgentBased[Directive::HOST]) && !empty($userAgentBased[Directive::HOST])) {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$maps = [];
Spaces must be used to indent lines; tabs are not allowed Open
return $this->tree[$userAgent][Directive::SITEMAP];
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
break;
Spaces must be used to indent lines; tabs are not allowed Open
private function checkCleanParamRule($rule, $path) {
Spaces must be used to indent lines; tabs are not allowed Open
if (!strpos($path, "?$param=")
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
// change @ to \@
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$escape = ['$' => '\$', '?' => '\?', '.' => '\.', '*' => '.*', '[' => '\[', ']' => '\]'];
Spaces must be used to indent lines; tabs are not allowed Open
// Not multibyte
Spaces must be used to indent lines; tabs are not allowed Open
* Check Host rule
Spaces must be used to indent lines; tabs are not allowed Open
* @return bool
Spaces must be used to indent lines; tabs are not allowed Open
if ($this->checkCleanParamRule(Directive::stripInline($rule), $path)) {
Spaces must be used to indent lines; tabs are not allowed Open
if (!is_null($userAgent)) {
Spaces must be used to indent lines; tabs are not allowed Open
$output[] = $directive . ': ' . $value;
Spaces must be used to indent lines; tabs are not allowed Open
if (isset($userAgentBased[Directive::SITEMAP]) && !empty($userAgentBased[Directive::SITEMAP])) {
Spaces must be used to indent lines; tabs are not allowed Open
// check if path prefix matches the path of the url we're checking
Spaces must be used to indent lines; tabs are not allowed Open
$sitemaps = $this->getSitemaps();
Spaces must be used to indent lines; tabs are not allowed Open
$this->log('Rule match: ' . Directive::CLEAN_PARAM . ' directive');
Spaces must be used to indent lines; tabs are not allowed Open
* @note NULL is returned to public API compatibility reasons. Will be removed in the future.
Spaces must be used to indent lines; tabs are not allowed Open
$value = str_replace(array_keys($escape), array_values($escape), $value);
Spaces must be used to indent lines; tabs are not allowed Open
public function getSitemaps(?string $userAgent = null): array {
Spaces must be used to indent lines; tabs are not allowed Open
$value .= '.*';
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$value = '^' . $value;
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
* Check url wrapper
Spaces must be used to indent lines; tabs are not allowed Open
* @param string|null $userAgent - which robot to check for
Spaces must be used to indent lines; tabs are not allowed Open
$url = new Url($url);
Spaces must be used to indent lines; tabs are not allowed Open
return $this->checkRules(Directive::DISALLOW, $url->getPath(), $userAgent);
Spaces must be used to indent lines; tabs are not allowed Open
$this->log("$directive directive: Not found");
Spaces must be used to indent lines; tabs are not allowed Open
public function getLog(): array {
Spaces must be used to indent lines; tabs are not allowed Open
*
Spaces must be used to indent lines; tabs are not allowed Open
foreach ($rules as $directive => $value) {
Spaces must be used to indent lines; tabs are not allowed Open
foreach ($value as $subValue) {
Spaces must be used to indent lines; tabs are not allowed Open
$output[] = 'Host: ' . $host;
Spaces must be used to indent lines; tabs are not allowed Open
// return all rules
Spaces must be used to indent lines; tabs are not allowed Open
// direct match
Spaces must be used to indent lines; tabs are not allowed Open
return $maps;
Spaces must be used to indent lines; tabs are not allowed Open
private static function isValidScheme($scheme) {
Spaces must be used to indent lines; tabs are not allowed Open
return $parsed;
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
$this->log('Invalid HTTP status code, not taken into account.', ['code' => $code], LogLevel::WARNING);
Spaces must be used to indent lines; tabs are not allowed Open
*/
Spaces must be used to indent lines; tabs are not allowed Open
protected function checkRuleSwitch(string $rule, string $path): bool {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
/**
Spaces must be used to indent lines; tabs are not allowed Open
foreach ($cleanParam['param'] as $param) {
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
? Directive::CACHE_DELAY
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
* @see RobotsTxtParser::getLogger()
Spaces must be used to indent lines; tabs are not allowed Open
}
Spaces must be used to indent lines; tabs are not allowed Open
$output[] = '';
Spaces must be used to indent lines; tabs are not allowed Open
return $this->tree[$userAgent];
Spaces must be used to indent lines; tabs are not allowed Open
* @param ?string $userAgent
Spaces must be used to indent lines; tabs are not allowed Open
$this->buildTree();
Spaces must be used to indent lines; tabs are not allowed Open
return $this->tree[$userAgent][Directive::HOST];
Line exceeds 120 characters; contains 124 characters Open
if (isset($this->tree[$userAgent][Directive::SITEMAP]) && !empty($this->tree[$userAgent][Directive::SITEMAP])) {
Opening brace should be on a new line Open
public function getLogger(): ?LoggerInterface {
Opening brace should be on a new line Open
private static function isValidHostName(string $host): bool {
Opening brace should be on a new line Open
public function setLogger(LoggerInterface $logger): void {
Opening brace should be on a new line Open
protected function checkRules(string $rule, string $path, string $userAgent = '*'): bool {
Opening brace should be on a new line Open
private static function isValidScheme($scheme) {
Opening brace should be on a new line Open
private function buildTree() {
Opening brace should be on a new line Open
public function isAllowed(string $url, ?string $userAgent = '*'): bool {
Opening brace should be on a new line Open
protected function parseURL($url) {
Opening brace should be on a new line Open
private function explodeCleanParamRule($rule) {
Opening brace should be on a new line Open
public function setHttpStatusCode(int $code): bool {
Opening brace should be on a new line Open
public function getCleanParam(): array {
Opening brace should be on a new line Open
public function getLog(): array {
Opening brace should be on a new line Open
public function setUserAgent(string $userAgent) {
Opening brace should be on a new line Open
protected function checkRuleSwitch(string $rule, string $path): bool {
Opening brace should be on a new line Open
public function getContent(): string {
Opening brace should be on a new line Open
private function checkHostRule($rule) {
Opening brace should be on a new line Open
public function getSitemaps(?string $userAgent = null): array {
Opening brace should be on a new line Open
public function getRules(?string $userAgent = null) {
Opening brace should be on a new line Open
private function checkCleanParamRule($rule, $path) {
Opening brace should be on a new line Open
public function render($eol = "\r\n") {
Opening brace should be on a new line Open
public function getHost(?string $userAgent = null) {
Opening brace should be on a new line Open
private function checkBasicRule(string $rule, string $path): bool {
Opening brace should be on a new line Open
public function getDelay(string $userAgent = "*", string $type = Directive::CRAWL_DELAY) {
Opening brace should be on a new line Open
private function checkHttpStatusCodeRule(): bool {
Opening brace should be on a new line Open
protected function prepareRegexRule(string $value): string {
Opening brace should be on a new line Open
public function isDisallowed(string $url, string $userAgent = '*'): bool {
The variable $error_msg is not named in camelCase. Open
private function checkHostRule($rule) {
if (!isset($this->url)) {
$error_msg = WarmingMessages::INLINED_HOST;
$this->log($error_msg, [], LogLevel::ERROR);
return false;
CamelCaseVariableName
Since: 0.2
It is considered best practice to use the camelCase notation to name variables.
Example
class ClassName {
public function doSomething() {
$data_module = new DataModule();
}
}
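For contrast with the flagged snippet, here is a minimal sketch of the camelCase rename, assuming the variable is only used locally. `Messages` and `HostRuleChecker` are hypothetical stand-ins so the snippet runs on its own; they are not the library's classes, the message text is invented, and the `return true` branch is a placeholder for logic the report does not show.

```php
<?php declare(strict_types=1);

// Hypothetical stand-in for the library's message constants (text is invented).
final class Messages
{
    public const INLINED_HOST = 'Inline host directive detected';
}

// Hypothetical reduced class; only the flagged branch is reproduced.
final class HostRuleChecker
{
    /** @var string[] collected log lines, standing in for the real logger */
    public array $log = [];

    private ?string $url = null;

    public function checkHostRule(string $rule): bool
    {
        if (!isset($this->url)) {
            // camelCase name ($errorMsg) replaces the flagged $error_msg
            $errorMsg = Messages::INLINED_HOST;
            $this->log[] = $errorMsg;
            return false;
        }

        // Placeholder: the remaining host check is not shown in the report.
        return true;
    }
}
```

Because the variable never leaves the method, the rename is purely local and should not affect callers.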