Strongly Sublinear Algorithms for Testing Pattern Freeness

For a permutation $\pi:[k] \to [k]$, a function $f:[n] \to \mathbb{R}$ contains a $\pi$-appearance if there exists $1 \leq i_1<i_2<\dots<i_k \leq n$ such that for all $s,t \in [k]$, $f(i_s)<f(i_t)$ if and only if $\pi(s)<\pi(t)$. The function is $\pi$-free if it has no $\pi$-appearances. In this paper, we investigate the problem of testing whether an input function $f$ is $\pi$-free or whether $f$ differs on at least $\varepsilon n$ values from every $\pi$-free function. This is a generalization of the well-studied monotonicity testing and was first studied by Newman, Rabinovich, Rajendraprasad and Sohler (Random Structures and Algorithms 2019). We show that for all constants $k \in \mathbb{N}$, $\varepsilon \in (0,1)$, and permutation $\pi:[k] \to [k]$, there is a one-sided error $\varepsilon$-testing algorithm for $\pi$-freeness of functions $f:[n] \to \mathbb{R}$ that makes $\tilde{O}(n^{o(1)})$ queries. We improve significantly upon the previous best upper bound $O(n^{1 - 1/(k-1)})$ by Ben-Eliezer and Canonne (SODA 2018). Our algorithm is adaptive, while the earlier best upper bound is known to be tight for nonadaptive algorithms.


Introduction
Given a permutation $\pi:[k] \to [k]$, a function $f:[n] \to \mathbb{R}$ contains a $\pi$-appearance if there exist $1 \leq i_1 < i_2 < \dots < i_k \leq n$ such that for all $s,t \in [k]$ it holds that $f(i_s) < f(i_t)$ if and only if $\pi(s) < \pi(t)$. In other words, the function values restricted to the indices $\{i_1, \dots, i_k\}$ respect the ordering in $\pi$. The function is $\pi$-free if it has no $\pi$-appearance. For instance, the set of all real-valued monotone non-decreasing functions over $[n]$ is exactly the set of $(2,1)$-free functions. The notion of $\pi$-freeness is well-studied in combinatorics, where the famous Stanley-Wilf conjecture about the bound on the number of $\pi$-free permutations $f:[n] \to [n]$ has spawned a lot of work [13, 14, 5, 25, ...]. Designing algorithms to determine whether a given permutation $f:[n] \to [n]$ is $\pi$-free is an active area of research [2, 1, 10], with linear time algorithms for constant $k$ [23, 20]. Apart from the theoretical interest, practical motivations to study $\pi$-freeness include the study of motifs and patterns in time series analysis [11, 31, 24].
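To make the definition concrete, the following is a minimal brute-force sketch in Python. It is only an illustration of what a $\pi$-appearance is, not a sublinear tester: it reads all of $f$ and tries every $k$-subset of indices. The function names `is_appearance` and `find_appearance` are hypothetical, and indices are 0-based here, unlike in the text.

```python
from itertools import combinations

def is_appearance(values, pi):
    """True if `values` (length k) is order-isomorphic to the pattern `pi`.

    Implements: values[s] < values[t]  iff  pi[s] < pi[t], for all s, t.
    """
    k = len(pi)
    return all(
        (values[s] < values[t]) == (pi[s] < pi[t])
        for s in range(k) for t in range(k)
    )

def find_appearance(f, pi):
    """Return 0-based indices i_1 < ... < i_k of a pi-appearance in f, or None."""
    for idx in combinations(range(len(f)), len(pi)):
        if is_appearance([f[i] for i in idx], pi):
            return idx
    return None
```

For example, a non-decreasing list has no $(2,1)$-appearance, so `find_appearance` returns `None` on it, and two equal values never form an appearance because the inequalities are strict.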
In this paper, we study property testing of $\pi$-freeness, as proposed by Newman, Rabinovich, Rajendraprasad and Sohler [28]. Specifically, given $\varepsilon \in (0,1)$, an $\varepsilon$-testing algorithm for $\pi$-freeness accepts an input function $f$ that is $\pi$-free, and rejects if $f$ differs from every $\pi$-free function on at least $\varepsilon n$ values. The algorithm is given oracle access to the function $f$ and the goal is to minimize the number of queries made by the algorithm. This problem is a generalization of the well-studied monotonicity testing on the line ($(2,1)$-freeness), which was one of the first problems studied in combinatorial property testing, and is still being studied actively [17, ...]. Newman, Rabinovich, Rajendraprasad and Sohler [28] showed that for a general permutation $\pi$ of length $k$, the property of $\pi$-freeness can be $\varepsilon$-tested using a nonadaptive algorithm of query complexity $O_{k,\varepsilon}(n^{1-1/k})$. Additionally, they showed that, for nonadaptive algorithms, one cannot obtain a significant improvement on this upper bound for $k \geq 4$. In a subsequent work, Ben-Eliezer and Canonne [7] improved this upper bound to $O_{k,\varepsilon}(n^{1-1/(k-1)})$, which they showed to be tight for nonadaptive algorithms. For monotone permutations $\pi$ of length $k$, namely, either $(1, 2, \dots, k)$ or $(k, k-1, \dots, 1)$, Newman et al. [28] presented an algorithm with query complexity $(\varepsilon^{-1} \log n)^{O(k^2)}$ to $\varepsilon$-test $\pi$-freeness. The complexity was improved, in a sequence of works [8, 9], to $O_{k,\varepsilon}(\log n)$, which is optimal for constant $\varepsilon$ even for the special case of testing $(2,1)$-freeness [19].
Despite the aforementioned advances in testing freeness of monotone permutations, improving the complexity of testing freeness of arbitrary permutations has remained open. For arbitrary permutations of length at most 3, Newman et al. [28] gave an adaptive algorithm for testing freeness with query complexity $(\varepsilon^{-1} \log n)^{O(1)}$. However, the case of general $k > 3$ has remained elusive. In particular, the techniques of [28] for $k = 3$ do not seem to generalize even for $k = 4$.
As remarked above, optimal nonadaptive algorithms are known for any $k$ [7], but their complexity tends to be linear in the input length as $k$ grows. For the special case of $(2,1)$-freeness, it is well-known that adaptivity does not help at all in improving the complexity of testing [18, 19].
Adaptivity is known to help somewhat for the case of testing freeness of monotone permutations of length $k$, where every nonadaptive algorithm has query complexity $\Omega((\log n)^{\log k})$ [8], and the $O_{k,\varepsilon}(\log n)$-query algorithm of Ben-Eliezer, Letzter, and Waingarten [9] is adaptive. Adaptivity significantly helps in testing freeness of arbitrary permutations of length 3, as shown by [28] and [7].

Our results
In this work, we give adaptive $\varepsilon$-testing algorithms for $\pi$-freeness of permutations $\pi$ of arbitrary constant length $k$ with complexity $\tilde{O}_{k,\varepsilon}(n^{o(1)})$. Hence, testing $\pi$-freeness has quite efficient sublinear algorithms even for relatively large patterns. Our result shows a strong separation between adaptive and nonadaptive algorithms for testing pattern freeness.

Discussion of our techniques
Our algorithm has one-sided error and rejects only if it finds a $\pi$-appearance in the input function $f:[n] \to \mathbb{R}$. In the following, we present some of the main ideas behind an $\tilde{O}(\sqrt{n})$-query algorithm for detecting a $\pi$-appearance in a function $f$ that is $\varepsilon$-far from $\pi$-free, for a permutation $\pi$ of length 4. The case of length-4 permutations is not very different from the general case (where we additionally recurse on problems of smaller length patterns). The $\tilde{O}(\sqrt{n})$-query algorithm is much simpler than the general one, but it outlines many of the ideas involved in the latter. Additionally, it already beats the lower bound of $\Omega(n^{2/3})$ on the complexity of nonadaptive algorithms for testing $\pi$-freeness for patterns of length 4 [7]. A more detailed description appears in Section 3. The formal description of the general algorithm is given in Section 5.
For a parameter $\varepsilon \in (0,1)$, a function $f$ is $\varepsilon$-far from $\pi$-free if at least $\varepsilon n$ values of $f$ need to be changed in order to make it $\pi$-free. In other words, the Hamming distance of $f$ to the closest real-valued $\pi$-free function over $[n]$ is at least $\varepsilon n$. A folklore fact is that the Hamming distance and the deletion distance of $f$ to $\pi$-freeness are equal, where the deletion distance of $f$ to $\pi$-freeness is the cardinality of the smallest set $S \subseteq [n]$ such that $f$ restricted to $[n] \setminus S$ is $\pi$-free.
By virtue of this equality, a function that is $\varepsilon$-far from $\pi$-free has a matching of $\pi$-appearances of cardinality at least $\varepsilon n/4$, where a matching of $\pi$-appearances is a collection of $\pi$-appearances such that no two of them share an index. This observation facilitates our algorithm and all previous algorithms on testing $\pi$-freeness, including monotonicity testers.
The basic ingredient in our algorithms is the use of a natural representation of $f:[n] \to \mathbb{R}$ by a Boolean function over a grid $[n] \times R(f)$, where $R(f)$ denotes the range of $f$. Specifically, we visualize the function as a grid of $n$ points in $\mathbb{R}^2$, such that for each $i \in [n]$, the pair $(i, f(i))$ is a point of the grid. We use $G_n$ to denote this grid of points. This view has been useful in the design of approximation algorithms for the related and fundamental problem of estimating the length of the Longest Increasing Subsequence (LIS) in a real-valued array [34, 33, 27, 29]. Adopting this view, for any permutation $\pi:[k] \to [k]$, a $\pi$-appearance at $(i_1, \dots, i_k)$ in $f$ corresponds naturally to a $k$-tuple of points $(i_j, f(i_j))$, $j = 1, \dots, k$, in $G_n$, whose relative order (in $G_n$) forms a $\pi$-appearance. The converse is also true: every $\pi$-appearance in the Boolean grid $G_n$ corresponds to a $\pi$-appearance in $f$.
We note that the grid $G_n$ is neither known to, nor directly accessible by, the algorithm. In particular, $R(f)$ is not assumed to be known. A main first step in our algorithm is to approximate the grid $G_n$ by a coarser $m \times m$ grid $G_{m,m}$ of boxes, for a parameter $m = m(n)$ that will determine the query complexity. The grid $G_{m,m}$ is defined as follows. Suppose that we have a partition of $R(f)$ into $m$ disjoint contiguous intervals of increasing values, referred to here as 'layers', $L_1, \dots, L_m$, and let $S_1, \dots, S_m$ be a partition of $[n]$ into $m$ contiguous intervals of equal size, referred to as 'stripes'. These two partitions decompose $G_n$ and the $f$-points in it into $m^2$ boxes and form the grid $G_{m,m}$. The $(i,j)$-th cell of this grid is the Cartesian product $S_i \times L_j$, and is denoted $\mathrm{box}(S_i, L_j)$. We view the nonempty boxes in $G_{m,m}$ as a coarse approximation of $G_n$ (and of the input function, equivalently). The grid $G_{m,m}$ has a natural order on its boxes (viewed as points in the $[m] \times [m]$ grid).
While $G_{m,m}$ is also not directly accessible to the algorithm, it can be well-approximated very efficiently. We can do this by sampling $\tilde{O}(m)$ indices from $[n]$ independently and uniformly at random and making queries to those indices to identify and mark the boxes in $G_{m,m}$ that contain a non-negligible density of points of $G_n$. This provides a good enough approximation of the grid $G_{m,m}$. For the rest of this high-level explanation, assume that we have fixed $m \ll n$, and we know $G_{m,m}$; that is, we assume that we know the number of points of $G_n$ belonging to each box in $G_{m,m}$, but not necessarily the points themselves.
If we find $k$ nonempty boxes in $G_{m,m}$ that form a $\pi$-appearance when viewed as points in the $[m] \times [m]$ grid, then $G_n$ (and hence $f$) contains a $\pi$-appearance for any set of $k$ points that is formed by selecting one point from each of the corresponding boxes. See Figure 1(A) for such a situation, for $\pi = (3, 2, 1, 4)$. We first detect such $\pi$-appearances by our knowledge of $G_{m,m}$.
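The box-level check described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's gridding procedure: layers are cut at evenly spaced ranks of the sampled values, every box hit by a sample is marked (no density threshold), and the names `marked_boxes` and `boxes_contain` are hypothetical. Stripes and layers are 0-based.

```python
from bisect import bisect_right
from itertools import combinations

def marked_boxes(f, m, sample):
    """Map sampled indices to (stripe, layer) cells of an m x m coarse grid.

    Stripes: m contiguous index intervals of (nearly) equal size.
    Layers: m contiguous value intervals, cut at evenly spaced ranks of the
    sampled values (an illustrative assumption).
    """
    n = len(f)
    vals = sorted(f[i] for i in sample)
    # m - 1 cut points; layer j holds values in [cuts[j-1], cuts[j])
    cuts = [vals[(j * len(vals)) // m] for j in range(1, m)]
    boxes = set()
    for i in sample:
        stripe = (i * m) // n
        layer = bisect_right(cuts, f[i])
        boxes.add((stripe, layer))
    return boxes

def boxes_contain(boxes, pi):
    """True if k boxes in distinct stripes and distinct layers are ordered like pi."""
    k = len(pi)
    for combo in combinations(sorted(boxes), k):
        stripes = [b[0] for b in combo]
        layers = [b[1] for b in combo]
        strictly_left_to_right = all(stripes[j] < stripes[j + 1] for j in range(k - 1))
        layers_match = all(
            (layers[s] < layers[t]) == (pi[s] < pi[t])
            for s in range(k) for t in range(k)
        )
        if strictly_left_to_right and layers_match:
            return True
    return False
```

When `boxes_contain` returns `True`, picking one point from each of the $k$ boxes gives a $\pi$-appearance in $f$, because boxes in distinct stripes and layers are separated in both index and value.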
However, the converse is not true: it could be that $G_n$ contains many $\pi$-appearances, where the corresponding points, called 'legs', are in boxes that share layers or stripes, and hence do not form $\pi$-appearances in $G_{m,m}$. See e.g., Figure 1(B) for such an appearance for $\pi = (3, 2, 1, 4)$. Thus, if the function is far from being $\pi$-free and no $\pi$-appearances are detected in $G_{m,m}$, then there must be many $\pi$-appearances in which some legs share a layer or a stripe in $G_{m,m}$. In this case, the seminal result of Marcus and Tardos [26] implies that only $O(m)$ of the boxes in $G_{m,m}$ are nonempty. An averaging argument implies that if $f$ is $\varepsilon$-far from being $\pi$-free, then after deleting layers or stripes in $G_{m,m}$ with $\omega(1)$ dense boxes, we are still left with a partial function (on the undeleted points) that is $\varepsilon'$-far from being $\pi$-free, for a large enough $\varepsilon'$.
For the following high-level description we consider $\pi = (3, 2, 1, 4)$, although all the following ideas work for any permutation of length 4. Any $\pi$-appearance has its four legs spread over at most 4 marked boxes. This implies that there are only constantly many non-isomorphic ways of arranging the marked boxes containing any particular $\pi$-appearance, in terms of the order relation among the marked boxes, and the way the legs of the $\pi$-appearance are included in them. These constantly many ways are called 'configurations' in the sequel. Thus any $\pi$-appearance is consistent with a certain configuration. Additionally, in the case that multiple points in a $\pi$-appearance share some marked boxes, this appearance induces appearances of permutations of length smaller than 4 in each box (which are sub-permutations $\nu$ of $\pi$). If a constant fraction of the $\pi$-appearances are spread across multiple marked boxes, there will be many such $\nu$-appearances in the marked boxes in the coarse grid. Hence, one phase of our algorithm will run tests for $\nu$-appearances for smaller patterns $\nu$ (which can be done in $\mathrm{polylog}\, n$ queries using known testers for patterns of length at most 3) on each marked box, and combine these $\nu$-appearances to detect a $\pi$-appearance, if any. This phase, while seemingly simple, will require extra care, as combining sub-pattern appearances into a global $\pi$-appearance is not always possible. This is a major issue in the general case for $k > 4$.
The simpler case is when there is a constant fraction of $\pi$-appearances such that all 4 points of each such appearance belong to a single marked box. This can be solved by randomly sampling a few marked boxes and querying all the points in them to see if there are any $\pi$-appearances. The case that a constant fraction of the $\pi$-appearances have their legs belonging to the same layer or the same stripe is an easy extension of the 'one-box' case.
To obtain the desired query complexity, consider first setting $m = \tilde{O}(\sqrt{n})$. Getting a good enough estimate of $G_{m,m}$ as described above takes $\tilde{O}(m) = \tilde{O}(\sqrt{n})$ queries. Then, testing each box for $\nu$-freeness, for smaller permutations $\nu$, takes $\mathrm{polylog}\, n$ queries per test, but since this is done for all marked boxes, this step also takes $\tilde{O}(m) = \tilde{O}(\sqrt{n})$ queries. Finally, in the last simpler case, we may just query all indices in a sampled box that contains at most $n/m = \Theta(\sqrt{n})$ indices, by our setting of $m$. This results in an $\tilde{O}(\sqrt{n})$-query tester for $\pi$-freeness. To obtain a better complexity, we reduce the value of $m$, and, in the last step, we randomly sample a few marked boxes and run the algorithm recursively. This is so since, in the last step, we are in the case that for a constant fraction of the $\pi$-appearances, all four legs of each $\pi$-appearance belong to a single marked box (or a constant number of marked boxes sharing a layer or stripe). The depth of recursion depends monotonically on $n/m$, and the larger it is, the smaller is the query complexity. The bound we describe in this article is $n^{O(1/\log \log \log n)}$, which is due to the exponential deterioration of the distance parameter $\varepsilon$ in each recursive call. Our algorithm for permutations of length $k > 4$ uses, in addition to the self-recursion, a recursion on $k$ too.
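As a rough accounting of the non-recursive tester sketched above, the three query costs can be added up as follows. The symbol $Q(n)$ is introduced here only for illustration, and the dependence on $\varepsilon$ is suppressed:

$$Q(n) \;=\; \underbrace{\tilde{O}(m)}_{\text{estimating } G_{m,m}} \;+\; \underbrace{O(m) \cdot \mathrm{polylog}\, n}_{\text{sub-pattern tests on marked boxes}} \;+\; \underbrace{O(n/m)}_{\text{reading a sampled box}} .$$

The first two terms grow with $m$ and the last shrinks with $m$, so they balance at $m \approx \sqrt{n}$, where $n/m \approx \sqrt{n}$ and hence $Q(n) = \tilde{O}(\sqrt{n})$. Taking $m$ smaller makes the first two terms cheaper but the boxes larger, which is why the last step is then replaced by a recursive call rather than by reading the whole box.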
Finally, we call $\pi$-freeness or $\nu$-freeness algorithms on marked boxes (or a collection of constantly many marked boxes sharing a layer or stripe) and not the entire grid. Since we do not know which points belong to the marked boxes, but only know that their density is significant, we can access points in them only via sampling and treating points that fall outside the desired box as being erased. This necessitates the use of erasure-resilient testers [16]. Such testers are known for all permutation patterns of length at most 3 [16, 29, 28]. In addition, the basic tester we design is also erasure-resilient, which allows us to recursively call the tester on appropriate subsets of marked boxes.
Some additional challenges we had to overcome: In the recursive algorithm for $k$-length permutation freeness, $k \geq 4$, we need to find $\nu$-appearances that are restricted to appear in specific configurations, for smaller length permutations $\nu$. To exemplify this notion, consider testing $\nu = (1, 3, 2)$-freeness. In the unrestricted setting, $f:[n] \to \mathbb{R}$ has a $\nu$-appearance if the values at any three indices have a $\nu$-consistent order. In a restricted setting, we may ask ourselves whether $f$ is free of $\nu$-appearances where the indices corresponding to the 1, 3-legs of a $\nu$-appearance are of value at most $n/2$ (that is, in the first half of $[n]$), while the index corresponding to the 2-leg is larger than $n/2$. This latter property seems at least as hard to test as the unrestricted one. In particular, for the $\nu$-appearance as described above, it could be that while $f$ is far from being $\nu$-free in the usual sense, it is still free of restricted $\nu$-appearances. In our algorithm, we need to test (at lower recursion levels) freeness from such restricted appearances. This extra restriction is discussed at a high level in Section 3. For a formal definition of the restricted testing problem and how it fits into our final algorithm, see Section 5.

Open directions
Testing restricted $\pi$-freeness: Testing for restricted $\pi$-appearances, as described above, is at least as hard as testing $\pi$-freeness. For monotone patterns (and hence 2-patterns), testing freeness and testing restricted appearances are relatively easy (can be done in $\mathrm{polylog}\, n$ queries).
For patterns of size 3 and more, the complexity of testing freeness of restricted appearances is currently open.
Weak $\pi$-freeness: In the definition of $\pi$-freeness, we required strict inequalities on function values to have an occurrence of the pattern. A natural variant is to allow weak inequalities, that is, for a set of indices $1 \leq i_1 < i_2 < \dots < i_k \leq n$, a weak-$\pi$-appearance is when for all $s,t \in [k]$ it holds that $f(i_s) \leq f(i_t)$ if and only if $\pi(s) < \pi(t)$. Such a relaxed requirement would mean that having a collection of $k$ or more equal values is already a $\pi$-appearance for any pattern $\pi$. For monotone patterns of length $k$, the deletion distance equals the Hamming distance, for any $k$, for this relaxed definition as well. We do not know if this is true for larger $k$ for non-monotone patterns in general, although we suspect that the Hamming distance is never larger than the deletion distance by more than a constant factor. Proving this will be enough to make our results true for testing freeness of any constant size forbidden permutation, even with the relaxed definition. We show that the Hamming distance is equal to the deletion distance for patterns of length at most 4. Hence, Theorem 1.1 also holds for weak-$\pi$-freeness for $k \leq 4$.
Another similarly related variant is when the forbidden order pattern is not necessarily a permutation (that is, an arbitrary function from $[k]$ to $[k]$ which is not one-to-one). For example, for the 4-pattern $\alpha = (1, 2, 3, 1)$, an $\alpha$-appearance in $f$ at indices $i_1 < i_2 < i_3 < i_4$ is when $f(i_1) < f(i_2) < f(i_3)$ and $f(i_4) = f(i_1)$, as dictated by the order in $\alpha$. For testing freeness of such patterns, an $\Omega(\sqrt{n})$ adaptive lower bound exists (by a simple probabilistic argument) even for the very simple case of $(1, 1)$-freeness, which corresponds to the property of being a one-to-one function.
An interesting point to mention, in this context, is that for testing freeness of forbidden permutations, a major tool that we use is the Marcus-Tardos bound [26], namely, that the number of 1's in an $m \times m$ Boolean matrix that does not contain a specific permutation matrix of order $k$ is $O(m)$. For non-permutation patterns, similar bounds are not true in general anymore, but do hold in many cases (or hold in a weak sense, e.g., only slightly more than linear). In such cases, the Marcus-Tardos bound could have allowed relatively efficient testing. However, the lower bounds hinted above for the $(1, 1)$-pattern make the testing problem completely different from that of testing forbidden permutation patterns.

Restricted functions:
In this paper we always consider the set of functions $f:[n] \to \mathbb{R}$ with no restrictions. Interesting questions occur when the set of functions is more restricted. One natural such restriction is for functions of bounded or restricted range (for the special case of $(2, 1)$-freeness, such a study was initiated by Pallavoor, Raskhodnikova and Varma [30] and followed upon by others [6, 29]). We do know that in the very extreme case, that is, for functions from the line $[n]$ to a constant-sized range, pattern freeness is testable in constant time even for a much more general class of forbidden patterns [4]. Apart from this extreme restriction, or the results for 2-patterns stated above, we are not aware of results concerning functions of bounded range (e.g., range that is $n^2$ or $\sqrt{n}$). Lastly, if we restrict our attention to functions $f:[n] \to [n]$ that are themselves permutations, Fox and Wei [21] argued that for some special types of distance measures such as the rectangular-distance and Kendall's tau distance, testing $\pi$-freeness can be done in constant

Other open questions:
The major open question left in this paper is to determine the exact (asymptotic) complexity of testing $\pi$-freeness of arbitrary permutations $\pi:[k] \to [k]$, $k \geq 3$.
While the gaps for $k = 3$ are relatively small (within $\mathrm{polylog}\, n$ range), the gaps are much larger for $k \geq 4$. We do not have any reason to think that the upper bound obtained in this draft is tight. We did not try to optimize the exponent of $n$ in the $\tilde{O}(n^{o(1)})$ expression, but the current methods do not seem to bring down the query complexity to $\mathrm{polylog}\, n$. We conjecture, however, that the query complexity is $\mathrm{polylog}\, n$ for all constant $k$. Another open question is whether the complexity of two-sided error testing might be lower than that of one-sided error testing.
Finally, Newman and Varma [29] used lower bounds on testing pattern freeness of monotone patterns of length $k \geq 3$ (for nonadaptive algorithms) to obtain lower bounds on the query complexity of nonadaptive algorithms for LIS estimation. Proving any lower bound better than $\Omega(\log n)$ for adaptively testing freeness, for arbitrary permutations of length $k$ for $k \geq 3$, may translate in a similar way to lower bounds on adaptive algorithms for LIS estimation.
Organization: Section 2 contains the notation, important definitions, and a discussion of some key concepts related to testing $\pi$-freeness. Section 3 contains a high-level overview of an $\tilde{O}(\sqrt{n})$-query algorithm for patterns of length 4. The formal description of our $\pi$-freeness tester for permutations $\pi$ of length $k \geq 4$ and the proof of correctness appear in Section 5.

Preliminaries and discussion
For a function $f:[n] \to \mathbb{R}$, we denote by $R(f)$ the image of $f$. We often refer to the elements of the domain $[n]$ as indices, and the elements of $R(f)$ as values. For $S \subseteq [n]$, $f|_S$ denotes the restriction of $f$ to $S$. Throughout, $n$ will denote the domain size of the function $f$.
We often refer to events in a probability space. For ease of representation, we will say that an event $E$ occurs with high probability, denoted 'w.h.p.', if $\Pr(E) > 1 - n^{-\log n}$, to avoid specifying accurate constants.
Let $\mathcal{S}_k$ denote the set of all permutations of length $k$. We view $\pi = (a_1, \dots, a_k) \in \mathcal{S}_k$ as a function (and not as a cycle), that is, $\pi(i) = a_i$ for $i \in [k]$. We refer to $a_i$ as the $i$th value in $\pi$, and as the $a_i$-leg of $\pi$. Thus, e.g., for $\pi = (4, 1, 2, 3)$, the first value is 4, and the third is 2, while the 4-leg of $\pi$ is at the first place and its 1-leg is at the second place. We often refer to $\pi \in \mathcal{S}_k$ as a $k$-pattern.
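The leg terminology can be illustrated with a tiny Python sketch. The helper name `leg_positions` is hypothetical, and positions are reported 1-based to match the text.

```python
def leg_positions(pi):
    """Map each value a of the pattern pi to the 1-based place of its a-leg."""
    return {a: pos for pos, a in enumerate(pi, start=1)}

# For pi = (4, 1, 2, 3): the 4-leg is at the first place, the 1-leg at the second.
legs = leg_positions((4, 1, 2, 3))
```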

Deletion distance vs. Hamming distance
The distance of a function from the property of being -free can be measured in several ways.
In this paper, we use Hamming and deletion distances as defined next. For $0 \leq \varepsilon < 1$ we say that $f$ is $\varepsilon$-far from $\pi$-freeness in deletion distance, or Hamming distance, if $\mathrm{dist}_\pi(f) \geq \varepsilon n$, and otherwise we say that $f$ is $\varepsilon$-close to $\pi$-freeness, where $\mathrm{dist}_\pi(f)$ is the corresponding distance.
It is obvious from the definition that Ddist  (  ) ≤ Hdist  (  ).For the other direction, assume that Ddist  ( , where [ − 1] ⊆ , by definition of .Now, the deletion distance of  ′′ is less than  and we are back to the case that the smallest index being deleted is greater than 1.

■
Claim 2.2 is extremely important for testing $\pi$-freeness, and is what gives rise to all known testers of monotonicity, as well as of $\pi$-freeness. This is due to the fact that the tests are really designed for the deletion distance, rather than the Hamming distance. The folklore observation made in Claim 2.3 facilitates such tests, and Claim 2.2 makes the tests work also for the Hamming distance. Due to Claim 2.2, we say that a function $f$ is $\varepsilon$-far from $\pi$-free without specifying the distance measure.
Let $\pi \in \mathcal{S}_k$ and $f:[n] \to \mathbb{R}$. A matching of $\pi$-appearances in $f$ is a collection of $\pi$-appearances that are pairwise disjoint as sets of indices in $[n]$. The following claim is folklore and immediate from the fact that the size of a minimum vertex cover of a $k$-uniform hypergraph is at most $k$ times the cardinality of a maximal matching.
Claim 2.3. If $f$ is $\varepsilon$-far from being $\pi$-free, then there exists a matching of $\pi$-appearances of size at least $\varepsilon n/k$.
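The vertex-cover argument behind this claim can be illustrated by a brute-force greedy construction of a maximal matching. This is a sketch for small inputs only (it reads all of $f$), and the names `order_isomorphic` and `greedy_matching` are hypothetical. Deleting all indices used by a maximal matching leaves a $\pi$-free restriction, so the deletion distance lies between the matching size and $k$ times the matching size.

```python
from itertools import combinations

def order_isomorphic(values, pi):
    """True if values[s] < values[t] exactly when pi[s] < pi[t]."""
    k = len(pi)
    return all((values[s] < values[t]) == (pi[s] < pi[t])
               for s in range(k) for t in range(k))

def greedy_matching(f, pi):
    """Greedily collect pairwise index-disjoint pi-appearances (a maximal matching)."""
    free = list(range(len(f)))   # indices not yet used by the matching
    matching = []
    while True:
        hit = next((idx for idx in combinations(free, len(pi))
                    if order_isomorphic([f[i] for i in idx], pi)), None)
        if hit is None:
            # No appearance remains among the free indices: matching is maximal.
            return matching
        matching.append(hit)
        used = set(hit)
        free = [i for i in free if i not in used]
```

For instance, on the decreasing list `[4, 3, 2, 1]` with pattern `(2, 1)`, the greedy matching has two appearances, while the deletion distance is 3, which indeed lies between 2 and $2 \cdot 2 = 4$.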
All our algorithms have one-sided error, i.e., they always accept functions that are $\pi$-free.
For functions that are far from being $\pi$-free, using Claim 2.3, our algorithms aim to detect some $\pi$-appearance, providing a witness for the function to not be $\pi$-free. Hence, in the description below, and throughout the analysis of the algorithms, the input function is assumed to be $\varepsilon$-far from $\pi$-free. We refer to an index-value pair $(i, f(i))$, $i \in [n]$, in the grid $G_n$ as a point. The grid has $n$ points, to which our algorithms do not have direct access. In particular, we do not assume that $R(f)$ is known. The function is one-to-one if $|R(f)| = n$. Note that if $M$ is a matching of $\pi$-appearances in $f$, then $M$ defines a corresponding matching of $\pi$-appearances in $G_n$. We will always consider this alternative view, where the matching $M$ is a set of disjoint $\pi$-appearances in the grid $G_n$.

Coarse grid of boxes
For a pair of subsets $(S, I)$, where $S \subseteq [n]$ and $I \subseteq R(f)$, we denote by $\mathrm{box}(S, I)$ the subgrid $S \times I$ of $G_n$ along with the set $\{(i, f(i)) : i \in S, f(i) \in I\}$ of points in $G_n$. In most cases, $S$ and $I$ will be intervals in $[n]$ and $R(f)$, respectively, and hence the name box. ... that is isomorphic to the grid $[m] \times [m]$. Note that $\mathrm{box}(S, I)$ could even be the entire grid $G_n$.
We say that layer $L$ is below layer $L'$, and write $L < L'$, if the largest value of a point in $L$ is less than the smallest value of a point in $L'$. For stripes $\mathrm{St}(S)$, $\mathrm{St}(S')$, we write $\mathrm{St}(S) < \mathrm{St}(S')$ if the largest index in $S$ is smaller than the smallest index in $S'$. For the grid $G_{m,m}$ and two boxes

Patterns among and within nonempty boxes
Consider a coarse grid of boxes, $G_{m,m}$, defined as above on the grid of points $G_n$. There is a natural homomorphism from the points in $G_n$ to the nonempty boxes in $G_{m,m}$ where those points fall. For $f$ and a grid of boxes $G_{m,m}$ as above, we refer to this homomorphism implicitly. This homomorphism defines when $G_{m,m}$ contains a $\pi$-appearance in a natural way. For example, consider the permutation $\pi = (3, 2, 1, 4) \in \mathcal{S}_4$. We say that $G_{m,m}$ contains $\pi$ if there are nonempty boxes $B_1, B_2, B_3, B_4$ such that $\mathrm{St}(B_1) < \mathrm{St}(B_2) < \mathrm{St}(B_3) < \mathrm{St}(B_4)$ and $L(B_3) < L(B_2) < L(B_1) < L(B_4)$ (see Figure 1(A)).

Observation 2.4.
Let $\mathcal{L}, \mathcal{S}$ be a partition of $G_n$ into layers and stripes as above, with $|\mathcal{L}| = m$, $|\mathcal{S}| = m$. If $G_{m,m}$ contains $\pi$ then $G_n$ (and equivalently $f$) has a $\pi$-appearance.
The converse of Observation 2.4 is not true; $G_n$ may contain a $\pi$-appearance while $G_{m,m}$ does not. This happens when some of the boxes that contain the $\pi$-appearance share a layer or a stripe. ... For $\pi \in \mathcal{S}_k$, a $\pi$-appearance in $G_n$ implies that the $k$ points corresponding to such a $\pi$-appearance ... To sum up, each $\pi$-appearance in $G_n$ defines an arrangement of nonempty boxes in $G_{m,m}$ that contain the legs of that appearance. This arrangement is defined by the relative order of the layers and stripes among the boxes, and has at most $k$ components. Such a box-arrangement that can contain the legs of a $\pi$-appearance is called a configuration. Note that there may be many different $\pi$-appearances in distinct boxes, all having the same configuration $\mathcal{C}$, namely, in which the arrangements of the boxes in terms of the relative order of layers and stripes are identical. So, every set of $\ell \leq k$ points in the $m \times m$ grid defines a configuration, and two such sets represent the same configuration if they are order-isomorphic with respect to the grid order.
For $\pi \in \mathcal{S}_k$, let $c(\pi)$ be the number of all possible configurations that are consistent with a $\pi$-appearance. ..., which is at most $2^{O(k \log k)}$.

■
A configuration $\mathcal{C}$ does not fully specify the way in which a $\pi$-appearance can be present.
It is necessary to also specify the way the $k$ legs of the $\pi$-appearance are partitioned among the boxes in a copy of $\mathcal{C}$. Let $\mathcal{B}$ denote a set of boxes forming the configuration $\mathcal{C}$. Let $\phi:[k] \to \mathcal{B}$ denote the mapping of the legs of the $\pi$-appearance to boxes in $\mathcal{B}$, where $\phi(j)$, $j \in [k]$, denotes the box in $\mathcal{B}$ containing the $j$-th leg of the $\pi$-appearance. We say that the copy of $\mathcal{C}$ formed by the boxes in $\mathcal{B}$ contains a $\phi$-legged $\pi$-appearance.
A configuration $\mathcal{C}$ in which the boxes form $p \geq 2$ components, and that is consistent with a $\pi$-appearance, defines $\nu_1, \dots, \nu_p$-appearances, respectively, in the $p$ components of $\mathcal{C}$, where $\nu_j$ for $j \in [p]$ is the subpermutation of $\pi$ that is defined by the restriction of $\pi$ to the $j$-th component. In addition, $\mathcal{C}$ defines the corresponding mappings $\phi_j$, $j = 1, \dots, p$, of the corresponding legs of each $\nu_j$ to the corresponding boxes in the $j$th component. For example, consider $\pi = (3, 2, 1, 4)$ and the box arrangement shown in Figure 1(F). That arrangement has two connected components: one that contains $B_1, B_4$ and the other that contains $B_2, B_3$, where we number the boxes from left to right (by increasing stripe order). Further, the (only) consistent partition of the legs of $\pi$ into these boxes is $\pi(i) \in B_i$, $i \in [4]$. In particular, it means that the component formed by $B_1, B_4$ contains the 3, 4 legs of $\pi$ and the component formed by $B_2, B_3$ contains the 2, 1 legs of $\pi$. Thus, in terms of the discussion above, the component formed by $B_1, B_4$ has a $\nu_1 = (1, 2)$-appearance (corresponding to the 3, 4 legs of $\pi$), with leg mapping $\phi_1$ mapping the 1-leg into $B_1$ and the 2-leg into $B_4$. Similarly, the component formed by $B_2, B_3$ has a $\nu_2 = (2, 1)$-appearance (corresponding to the 2, 1 legs of $\pi$) with corresponding leg mapping $\phi_2$ that maps the 2-leg into $B_2$ and the 1-leg into $B_3$. Note that the converse is also true: every $\nu_1$-appearance in the component $B_1 \cup B_4$, with a leg-mapping $\phi_1$ (that is, in which the 1, 2 legs are in $B_1, B_4$ respectively), in addition to a $\nu_2$-appearance in $B_2 \cup B_3$ with the leg-mapping $\phi_2$, results in a $\pi$-appearance in $G_{m,m}$. This leads to the crucial observation that if $\pi$ defines the corresponding $\nu_1, \dots, \nu_p$ appearances in the $p$ components of the configuration $\mathcal{C}$, then any $\nu_1, \dots, \nu_p$-appearances in the $p$ components of any copy of $\mathcal{C}$ with consistent leg-mappings is a $\pi$-appearance in $\mathcal{C}$. This is formally stated below.
We point out that the definition in [16] is for any property and for two-sided error testing as well.
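To make the subpermutations $\nu_j$ in the component decomposition of a configuration concrete, here is a small Python sketch that computes the pattern induced by $\pi$ on a subset of its positions. The helper name `subpattern` is hypothetical; positions are 1-based to match the text.

```python
def subpattern(pi, positions):
    """Pattern induced by pi on the given 1-based positions, relabelled to 1..len(positions)."""
    vals = [pi[p - 1] for p in sorted(positions)]
    # Replace each value by its rank among the selected values.
    rank = {v: r for r, v in enumerate(sorted(vals), start=1)}
    return tuple(rank[v] for v in vals)

pi = (3, 2, 1, 4)
nu_1 = subpattern(pi, {1, 4})  # legs 3 and 4 of pi, sitting in boxes B_1 and B_4
nu_2 = subpattern(pi, {2, 3})  # legs 2 and 1 of pi, sitting in boxes B_2 and B_3
```

With $\pi = (3, 2, 1, 4)$ this reproduces the two sub-patterns from the Figure 1(F) example, $(1, 2)$ and $(2, 1)$.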

Dixit et al. [16] give a one-sided error $\alpha$-ER $\varepsilon$-tester for monotonicity of functions $f:[n] \to \mathbb{R}$ with query complexity $O\left(\frac{\log n}{\varepsilon}\right)$ that works for any constants $\alpha, \varepsilon \in [0, 1)$. It can be observed that the $\mathrm{polylog}\, n$-query one-sided error tester for $\pi$-freeness of [28], for any $\pi \in \mathcal{S}_3$, is also ER.
As part of our algorithm for testing $\pi$-freeness for $\pi \in \mathcal{S}_k$ for $k \geq 4$, we call testers for smaller subpatterns on subregions of the grid $G_n$ which may be defined by, say, $\mathrm{box}(S, I)$ for some $S \subseteq [n]$, $I \subseteq R(f)$. In this case, the only access to points in $\mathrm{box}(S, I)$ is by sampling indices from $S$ and checking whether their values fall in $I$. If the values do not fall in $I$, we can treat them as erasures. Given the promise that the number of points falling in $\mathrm{box}(S, I)$ is a constant fraction of $|S|$, we can simply run ER testers on $f|_S$ to test for these smaller subpatterns.
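The access model described above, where values falling outside the box are treated as erasures, can be sketched as follows. This is an illustration only: the names `ERASED`, `query_in_box` and `sample_box` are hypothetical, the stripe is assumed to be a Python `range` of 0-based indices, and the layer is assumed to be a half-open value interval `[lo, hi)`.

```python
import random

ERASED = None  # hypothetical marker for an erased value

def query_in_box(f, i, lo, hi):
    """Query f at index i; report the value only if it lies in the layer [lo, hi)."""
    v = f[i]
    return v if lo <= v < hi else ERASED

def sample_box(f, stripe, lo, hi, q, rng):
    """Sample q indices uniformly from `stripe` and view values outside [lo, hi) as erased."""
    picks = [rng.choice(stripe) for _ in range(q)]
    return [(i, query_in_box(f, i, lo, hi)) for i in picks]

# Example: restrict attention to indices 0..3 and values in [0, 4).
f = [5, 1, 9, 3, 7, 2]
view = sample_box(f, range(0, 4), 0, 4, 20, random.Random(0))
```

An erasure-resilient tester run on such a view only ever uses the non-erased entries, which are exactly the sampled points of the box.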

High-level description of the basic algorithm for $\pi \in \mathcal{S}_4$
In this section, we give a high-level description of most of the ideas used in the design of our $\pi$-freeness tester of query complexity $\tilde{O}(n^{o(1)})$. We first describe the ideas behind an $\tilde{O}(\sqrt{n})$-query $\varepsilon$-tester for $\pi$-freeness of functions $f:[n] \to \mathbb{R}$, where $\pi \in \mathcal{S}_4$ and $\varepsilon \in (0, 1)$. At the end of this section, we briefly touch upon how to generalize these ideas to obtain the query complexity of $\tilde{O}(n^{o(1)})$ for constant-length permutations of length at least 4. For simplicity, we assume in what follows that the input function $f:[n] \to \mathbb{R}$ is one-to-one. The algorithm for functions that are not one-to-one differs in a few places and these are explained in Section 5.1.
For the purposes of this high-level description, we fix the forbidden permutation $\pi = (3, 2, 1, 4)$. The same algorithm works for any $\pi \in \mathcal{S}_4$. We view $f$ as an (implicitly given) ... It is either the case that the dense boxes contain all but an insignificant fraction of the points in $G_n$, or the total number of nonempty boxes is larger than $m' \log n$.
Next, we use the following lemma of Marcus and Tardos.

Lemma 3.1 ([26]). For any $\pi \in \mathcal{S}_k$, $k \in \mathbb{N}$, there is a constant $\kappa(k) \in \mathbb{N}$ such that for any $r \in \mathbb{N}$, if a grid $G_{r,r}$ contains at least $\kappa(k) \cdot r$ marked points, then it contains a $\pi$-appearance among the marked points.
Let $\kappa = \kappa(4)$. Using Lemma 3.1, we may assume that there are at most $\kappa \cdot m'$ nonempty boxes in $G_{m',m'}$, as otherwise, we already would have found a $\pi$-appearance in $G_{m',m'}$, which by Observation 2.4 implies a $\pi$-appearance in $G_n$ and in $f$ as well. Hence, as a result of the gridding, if we do not see a $\pi$-appearance among the sampled points, the second item above implies that there are $\Theta(m')$ dense boxes in $G_{m',m'}$ and that these boxes cover all but an insignificant fraction of the points of $G_n$.
An averaging argument implies that, for an appropriate value $d = d(\varepsilon)$, only a small fraction (depending on $\varepsilon$) of layers (or stripes) contain more than $d$ nonempty boxes. Therefore, since the grid $G_n$ is $\varepsilon$-far from being $\pi$-free, the restriction of $G_n$ to the layers and stripes that contain at most $d$ boxes each is also $\varepsilon'$-far from $\pi$-free for a large enough $\varepsilon' < \varepsilon$. This implies that $G_n$ restricted to the points in dense boxes that belong to layers and stripes containing at most $d$ dense boxes each has a matching $M$ of $\pi$-appearances of size at least $\varepsilon' n/4$. We assume in what follows that this is indeed the situation.
An important note at this point is that every dense box  is contained in ( 3 ) many copies of 1-component configurations with at most 4 dense boxes.This implies that there are ( 3 ) Recall that every -appearance in  defines a configuration of at most 4 components in   ′ , ′ .Hence, the matching  of size | | = Ω  () can be partitioned into 4 sub-matchings  =  1 ∪  2 ∪  3 ∪  4 , where   ,  = 1, . . ., 4 consists of the -appearances participating in configurations having exactly  components.Since | | = Ω  () it follows that at least one of   ,  = 1, 2, 3, 4 is of linear size.Now, any -appearance in  4 is an appearance in 4 distinct dense boxes in   ′ , ′ , where no two share a layer or a stripe.In that case, such an appearance can be directly detected from the tagged   ′ , ′ with no further queries.
The description of the rest of the algorithm can be viewed as a treatment of several independent cases regarding which one among the constantly many configuration types contributes the larger mass out of the Ω  () -appearances in  1 ∪  2 ∪  3 .There are only two significant cases, but to enhance understanding, we split these two cases into the more natural larger number of cases, and observe at the end that most cases can be treated conceptually in the same way.This can be done in (polylog ) queries (e.g., [8]).Then once finding a (3, 2, 1) in  1 for which (a) and (b) hold,  1 ∪  2 contains a -appearance.

Case 1:
We note here that for the example above, we ended by testing for (3, 2, 1)-freeness which is relatively easy.For a different configuration or , we might need to test  1 for a different  ∈ S 3 , but this can be done for any  ∈ S 3 using (polylog ) queries [28].Hence the same argument and complexity guarantee hold for any 2-component configuration C as above.
Case 4: A more complicated situation arises when | 2 | ≥  ′ /3, and the corresponding configurations of the -appearances in  2 are formed of two components , , with  holding 3 legs of  in 2 or 3 boxes (rather than in one box as in Case 3).E.g.,  = (4, 2, 1, 3), and the configuration C as illustrated in Figure 1(E).
By a similar averaging argument to that made in Case 2, it follows that there is a dense box  1 for which (a) there are dense boxes  2 ,  3 forming a copy  ′ of  with  1 , and a dense box  such that the configuration formed by  ′ ,  is a copy of C, and (b) there are Ω  (/) = Ω  ( √ ) -legged (3, 2, 1)-appearances in  ′ , where  is consistent with the leg mapping that is induced by the configuration C.This implies a conceptually similar test to that of the simpler Case 3 above -we test each of the () components  for (3, 2, 1)-freeness, and then with the existence of the corresponding box  we find a -appearance.However, this is not perfectly accurate: the algorithm for finding  = (3, 2, 1) in  ′ , although efficient, might find a (3, 2, 1)appearance where the 3 legs appear in  1 or in  1 ∪  2 .But this does not extend with  to form a -appearance, as the leg mapping is not consistent with the one that is induced by C.
Namely, unlike before, we do not only need to find a -appearance in  but rather a -legged -appearance with respect to a fixed mapping  (that in this case maps each leg to a different box in the component  ′ ).
There are several ways to cope with this extra restriction.For the current description of a basic Õ( √ ) algorithm, it is enough to sample a constant number of copies of the component  and do the test for -legged -appearance in each.But, since each copy  ′ is of size ( √ ) we can afford to query all indices in the domain of  ′ .
To resolve the problem in the general setting, we need to efficiently detect -legged appearances in multi-boxed components.This, however, we currently do not know how to do.Instead, we design a test that either finds a -legged -appearance, or finds the original -appearance.This is done using the algorithm AlgTest  (, , , , ) that will be described in Section 5.

Case 5:
The last case that we did not consider yet is when most of the -appearances are in a configuration containing more than one component, with at least two components containing two (or more) legs each.For  ∈ S 4 the only such case is when the configuration C contains exactly two components, each containing exactly two legs of .Returning to our working example with  = (3, 2, 1, 4), such an example is depicted in Figure 1(F).For the explanation below, we will discuss the case that the configuration C is as in Figure 1(F).Namely, it contains components  1 that is above  2 , with two boxes each  1 = { 1 ,  4 } and  2 = { 2 ,  3 }, and so that every box contains exactly one leg of  (boxes are numbered by order from left to right in   ′ , ′ ).Our goal is to find two copies  ′ 1 ,  ′ 2 of the components  1 ,  2 respectively, that form a copy of C, and to find a  1 -legged appearance of (1, 2) in  ′ 1 , and a  2 -legged appearance of (2, 1) in  ′ 2 , so that these two appearances will together form a -appearance.Indeed, an averaging argument shows that there are  ′ 1 ,  ′ 2 as above, with  ′  containing Ω  (/)   -legged appearances of   for  = 1, 2. However, we do not know whether sampling a pair  ′ 1 ,  ′ 2 in some way, will result in such a good pair.Rather, we are only assured of the existence of only one such pair!Hence, in this case we need to test every component copy  ′ of the appropriate type, for every  ∈ S 2 , and for every leg mapping , for a -legged -appearance in  ′ in order to find such an asserted pair of components.Such restricted -appearances can be tested in (log ) queries per component.Since the number of two-boxed component copies where both boxes belong to the same layer is (), this step takes Õ() queries in total.
The same argument holds for any $\pi \in \mathcal{S}_4$, and for every configuration that is consistent with Case 5.

Concluding remarks
At some places in the algorithm above, we had to test for $\nu$-appearances (or restricted $\nu$-appearances) in 'dense' subgrids of $G_n$. For this, we need all our algorithms to be ER, which will be implicitly clear from the description. We also need to take care of reducing the total error when we run a non-constant number of tests, or when we want to guarantee a large success probability for a large number of events. This is done by a trivial amplification that results in a multiplicative polylog $n$ factor.
In Case 1, we reduced the problem of finding a $\pi$-appearance in $G_n$ that is assumed to be $\varepsilon$-far from $\pi$-free, to the same problem on a subrange of the indices (formed by a small component) of size $\Theta(n/m)$ (with a smaller but constant distance parameter $\varepsilon' < \varepsilon$). For the setting of $m = \sqrt{n}$, solving the problem on the reduced domain was trivially done by querying all indices in the subrange. In the general algorithm, where our goal is a query complexity of $n^{o(1)}$, we set $m = n^{\delta}$ for an appropriately small $\delta$ and apply self-recursion in Case 1.
In Case 5, we had to test for $\nu$-freeness (or for restricted $\nu$-appearances) for $\nu \in \mathcal{S}_2$ for every small component of size $\Theta(n/m)$ in $G_{m',m'}$. This entails a collection of $O(m)$ tests, where we want to assign a large success probability to each one of them. We also need to guarantee a large success probability of correctly tagging each of the $\Theta(m^2)$ boxes as part of the layering procedure. A similar need will also arise in the general algorithm. We amplify the success probability by multiplying our number of queries by $\log^2 n$, which implies a failure probability of less than $1/n^{\Omega(\log n)}$ for each individual event in such a collection.
We will not comment more on this point, and assume implicitly that in all such places, all needed events occur w.h.p.
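The amplification step can be made explicit with a short calculation. This is a sketch under stated assumptions, not the paper's own derivation: it assumes a single run of a one-sided test misses an existing appearance with probability at most $1/3$, that runs are independent, and that logarithms are base 2. The symbol $N$ for the number of events is introduced here for illustration.

```latex
% Failure probability after \log^2 n independent repetitions of a test
% whose single-run failure probability is at most 1/3:
\Pr[\text{all runs fail}]
  \;\le\; \left(\tfrac{1}{3}\right)^{\log^2 n}
  \;=\; 2^{-(\log 3)\,\log^2 n}
  \;=\; n^{-(\log 3)\,\log n}
  \;=\; \frac{1}{n^{\Omega(\log n)}} .
% A union bound over N = \mathrm{poly}(n) such events keeps the total
% failure probability at most N \cdot n^{-(\log 3)\log n} = 1/n^{\Omega(\log n)}.
```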
In Cases 2, 3, 4 we end up testing $\nu$-freeness for $\nu \in \mathcal{S}_2 \cup \mathcal{S}_3$ in dense boxes, or $\phi$-legged $\nu$-freeness of such $\nu$ in components of multiple dense boxes. An averaging argument shows that this can simply be done by sampling one box or component, and making queries to all indices therein.
Case 5 is different: here, sampling a small number of components does not guarantee an expected large number of the corresponding appearances. This is the reason that we need to test all components with at most 2 dense boxes for $\phi$-legged $\nu$-freeness, for every $\nu \in \mathcal{S}_2$ and leg mapping $\phi$. Algorithm AlgTest$_\pi(\nu, \phi, D, m, \varepsilon)$ can do this for any $\nu \in \mathcal{S}_2 \cup \mathcal{S}_3$ in $n^{\delta}$ queries for an arbitrarily small constant $\delta$. Since we have to do it in Case 5, we may do the same in Cases 2, 3, 4 as well. As a result, the algorithm above will contain only two cases: Case 1, where we reduce the problem to the same problem on a smaller domain, and the new Case 2, where we test every small component for a $\phi$-legged $\nu$-appearance for every $\nu \in \mathcal{S}_2 \cup \mathcal{S}_3$ and every leg mapping $\phi$; namely, a case in which we reduce the problem to testing (restricted appearances) for smaller patterns.
In view of the comment above, the idea behind improving the complexity to $n^{\delta}$ for constant $0 < \delta < 1$ is obvious: choosing $m = n^{\delta/2}$ will result in an $m \times m$ grid, where Layering can be done in $\tilde{O}(n^{\delta/2})$ queries. Then, Case 2 will be done in an additional $n^{\delta}$ queries by setting the query complexity of AlgTest$_\pi(\nu, \phi, D, m, \varepsilon)$ to be $n^{\delta/2}$ per component. The self-recursion in Case 1 will result in the same problem over a range of size $n/m$. For the fixed $m = n^{\delta/2}$, this will result in a recursion depth of $2/\delta$, after which the domain size will drop down to $m$ and allow making queries to all corresponding indices. This results in a total of $\tilde{O}(n^{\delta})$ queries, including the amplification needed to account for the accumulation of errors and the deterioration of the distance parameter at lower recursion levels.

… $\nu' \in \mathcal{S}_{r'}$, $r' < r$, or self-reducing the problem of finding a $\phi$-legged $\nu$-appearance, but in a sub-component $D'$ whose size is a factor $m$ smaller than that of $D$. This is done in a similar way to what is described above in Case 1.
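The recursion-depth claim for $m = n^{\delta/2}$ can be checked numerically. The sketch below is illustrative only: the function name `recursion_depth` is hypothetical, and it assumes each self-recursion shrinks the domain by exactly a factor $m$ (rounding up), stopping once the domain has at most $m$ indices and can be queried in full.

```python
def recursion_depth(n: int, m: int) -> int:
    """Count the shrinking steps needed to go from a domain of size n
    down to a domain of size at most m, dividing by m at each step."""
    if m < 2:
        raise ValueError("m must be at least 2")
    depth = 0
    size = n
    while size > m:
        size = -(-size // m)  # ceiling of size / m, using integers only
        depth += 1
    return depth


if __name__ == "__main__":
    n = 2 ** 40
    m = 2 ** 10  # m = n^(delta/2) with delta = 1/2, so 2/delta = 4
    print(recursion_depth(n, m))
```

For exact powers, as in this example, the number of shrinking steps stays within the $2/\delta$ bound quoted in the text.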

Generalized testing and testing
In summary, the algorithm for GeneralizedTesting  () is very similar to the algorithm for testing -freeness, with the same two cases, where Case 2 becomes recursion to finding appearances of a smaller permutation, and where the base case is for permutations of length 2. As we show in Section 5, formally, GeneralizedTesting  () strictly generalizes testing -freeness, and hence, the formal algorithm for testing -freeness will be a special case of GeneralizedTesting  ().

Gridding
In this section, we describe an algorithm that we call Gridding (Algorithm 2), which is a common subroutine to all our algorithms. The output of Gridding, given oracle access to the function $f:[n] \to \mathbb{R}$ and a parameter $m \leq n$, is an $m \times m$ grid of boxes that partitions either the grid $G_n$ defined by $f$ or a region inside of it into boxes, with the property that the density of each box, which we define below, is well controlled.

The conditions of a nice $m$-partition (Definition 4.2) are the following:
$m' \leq 2m$, the intervals $I_1, \dots, I_{m'}$ are pairwise disjoint, and $\bigcup_{j \in [m']} I_j = I$. In particular, the largest value in $I_j$ is less than the smallest value in $I_{j'}$ for $j < j'$.
For $j \in [m']$, either den$(S, I_j) < 4/m$, or $I_j$ contains exactly one value and is such that den$(S, I_j) \geq 1/(2m)$. In the first case, we say that box$(S, I_j)$ is a multi-valued layer of box$(S, I)$, and in the second case, we say that box$(S, I_j)$ is a single-valued layer of box$(S, I)$.

Layering
The main part of Gridding is an algorithm Layering, which is described in Algorithm 1. A similar algorithm was used by Newman and Varma [29] for estimating the length of the longest increasing subsequence in an array. Layering$(S, I, m)$, given $S \subseteq [n]$, $I \subseteq R(f)$, and $m \leq n$ as inputs, outputs, with probability at least $1 - 1/n^{\Omega(\log n)}$, a set $\mathcal{I}$ of intervals that is a nice $m$-partition of box$(S, I)$. It works by sampling $\tilde{O}(m)$ points from box$(S, I)$ and outputs the set $\mathcal{I}$ based on these samples. Note that both the sets $S$ and $I$ are either contiguous index/value intervals themselves or a disjoint union of at most $k$ such contiguous intervals. Additionally, we always apply the algorithm Layering to boxes of density $\Omega(1/\log n)$.
C L A I M 4 .3. If den(, ) > 1/log , then with probability 1 − 1/ Ω(log ) , Layering(, , ) returns a collection of intervals I = {  }  ′ =1 such that I is a nice -partition of box(, ).Furthermore, it makes a total of  log 4  queries.If  1 > / then  1 will contain only  1 , otherwise  1 will contain the maximal subsequence ( 1 , . . .,   ) whose sum is at most 2/.We then delete the members of  1 from  and repeat the process.For  ∈ [ ′′ ], let (  ) denote the total weight in   .
Correspondingly, we obtain a partition of the sequence seq of sampled values into at most  ′′ subsequences {seq  } ∈[ ′′ ] .Some subsequences contain only one value of weight at least / and are called single-valued.The remaining subsequences are called multi-valued.
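The greedy grouping of sampled values into single-valued and multi-valued subsequences can be sketched as follows. This is a hedged reading of the description above, not Algorithm 1 itself: the thresholds are assumed to be the total sample weight divided by $m$ (for a heavy value) and twice that quantity (as the cap on a multi-valued group), and the name `greedy_layers` is hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def greedy_layers(sample: Sequence[float], m: int) -> List[Tuple[List[float], str]]:
    """Group the distinct sampled values, in increasing order, into layers.

    The weight of a value is its multiplicity in the sample. A value whose
    weight exceeds total/m at the start of a group forms a single-valued
    layer; otherwise the group is the maximal run of consecutive values
    whose total weight is at most 2*total/m (assumed thresholds).
    """
    counts = Counter(sample)
    values = sorted(counts)
    total = len(sample)
    layers: List[Tuple[List[float], str]] = []
    i = 0
    while i < len(values):
        if counts[values[i]] * m > total:  # heavy value: weight > total/m
            layers.append(([values[i]], "single-valued"))
            i += 1
            continue
        group: List[float] = []
        weight = 0
        # Extend the group while its weight stays at most 2*total/m.
        while i < len(values) and (weight + counts[values[i]]) * m <= 2 * total:
            group.append(values[i])
            weight += counts[values[i]]
            i += 1
        layers.append((group, "multi-valued"))
    return layers
```

Integer comparisons (`weight * m` against `total`) are used instead of division so that the thresholds are evaluated exactly.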
6: For  ∈ [ ′′ ], if the interval   is the disjoint union of two contiguous intervals  (1)    and  (2)   , then drop such an interval   from consideration.⊲ This situation can arise since  is the disjoint union of several contiguous intervals and hence   can contain points from two such consecutive and contiguous subintervals of .In this case, by definition, does not fail in Step 2. In the rest of the analysis, we condition on this event happening.
To prove that  ′ ≤ 2, it is enough to bound  ′′ , which is the total number of intervals formed before some multi-valued intervals are dropped at the last step.The total number of intervals of weight at least / is at most  since the total weight is .Other intervals have weight less than / and for each such interval   , it must be the case that  −1 and  +1 are of weight at least /.It follows that  ′ ≤  ′′ ≤ 2.
We now prove that the family I output by Layering is a nice -partition of box(, ).The probability that less than 2/ points from the sample have values in the range [, ] is at most 1/ Ω(log ) by a Chernoff bound.Conditioning on this event implies that for every   ,  ∈ [ ′ ] output as a multi-valued interval by the algorithm, we have den(,   ) < 4  .Finally, for a single- , with probability at least 1 − 1/ Ω(log ) , we have den(, [, ]) ≤ 3  2 den(, [, ]) < 3 4 , where den(,  ′ ) denote the estimated density (as estimated in Algorithm 1) for a layer box(,  ′ ) when  ′ ⊆ .Conditioning on this event implies that for every   ,  ∈ [ ′ ] output as a single-valued interval by the algorithm, we have Finally, the number of layers that get dropped is at most , each of them is multi-valued and hence, conditioning on the above events, the density of points lost in this process is at most (1).Putting all of this together, we can see that the layers form a nice -partition of box(, ).
The claim about the query complexity is clear from the description of the algorithm.■

Gridding
Next, we describe the algorithm Gridding (see Algorithm 2).
We note that initially, at the topmost recursion level of the algorithm for $\pi$-freeness, we call Gridding with $S = [n]$, $I = (-\infty, +\infty)$ and our preferred $m$, which is typically $m = n^{\delta}$ for some small $\delta < 1$.
We prove in Claim 4.4 that running Gridding(, , ) results in a partition of box(, ) into a grid of boxes   ′ , ′ in which either the marked boxes contain a -appearance, or the union of points in the marked boxes contain all but an  fraction of the points in   , for  << .
Additionally, with high probability, all boxes that are tagged dense have density at least   The set of intervals corresponding to the layers of   ′ , ′ form a nice -partition of .The third item follows by a simple application of the Chernoff bound followed by a union bound over all stripes.
For the second item, fix a stripe   of   ′ , ′ .Let  ⊆ [ ′ ] be the set of all  ∈ [ ′ ] such that box(  ,   ) gets marked during Step 3 in Gridding.If ∈ den(  ,   ) ≥ 1 − 1/(log 2 ) then we are done.Otherwise, each query independently hits a box that is not marked by any of the previous queries with probability greater than 1/(log 2 ).Thus, the expected number of boxes marked is at least log 2 / 2 .Chernoff bound implies that, with probability at least 1 −  −Ω(log ) , at least log 2  100 2 boxes are marked.The union bound over all the stripes implies the second item.

Generalized testing of forbidden patterns
In this section, we formally define the problem of testing (or deciding) freeness from -appearances with a certain leg-mapping.We then provide an algorithm for a relaxation of this testing problem.Our algorithm for testing -freeness is based on this.A description of the algorithm, and a proof sketch for the case of patterns of length 3 for specific leg-mappings is provided in Section 5.1.1.It illustrates some of the ideas for the general case, and it might be easier to follow.This is followed by an algorithm and a correctness proof for the most general case.
Recall that $G_n$ denotes the $n \times |R(f)|$ grid that represents the input function $f:[n] \to \mathbb{R}$.
Let  ℓ,ℓ be a partition of   into a grid of boxes for an arbitrary ℓ ≥ 1, and  be a connected component in  ℓ,ℓ containing  boxes  1 , . . .,   .Let  ∈ S  , and let  : [] ↦ → { 1 , . . .,   } be an arbitrary mapping of the legs of  into the boxes of , where  ≤ .We say that 1 ≤  1 < . . .<   ≤  is a -legged -appearance if ( 1 , . . .,   ) forms a -appearance in   such that the point (  ,  (  )) is contained in the box ( ) for each  ∈ [].That is, the legs of the -appearance are mapped into the boxes given by .St(  )| points belonging to  must be modified in order to make  free of -legged -appearances, where St() for a box  denotes the stripe corresponding to .Note that a function could be -legged -free but very far from being -free.For example, for the  referred to above in Figure 1(B), it could be that there are many appearances of (3,2,1,4) which are all in the left box or all in the right box or both, but there are no appearances with the leg mapping .
The property of being free of $\phi$-legged $\nu$-appearances is a generalization of the property of $\pi$-freeness. Taking $\ell = 1$, $G_{\ell,\ell}$ is just $G_n$ itself viewed as one single box $B$. When $\nu = \pi$ and $\phi$ is the constant function that maps each leg to the unique box $B$, any $\pi$-appearance in $G_n$ is a $\phi$-legged $\nu$-appearance.
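The definition of a legged appearance can be made concrete with a brute-force checker. This is an illustrative sketch only, exponential in the pattern length and unrelated to the sublinear tester: the names `Box`, `is_order_isomorphic`, `in_box` and `find_legged_appearance` are hypothetical, indices are 0-based, and each box is assumed to be an axis-parallel rectangle given by inclusive index and value bounds.

```python
from itertools import combinations
from typing import Optional, Sequence, Tuple

# (first index, last index, lowest value, highest value), all inclusive
Box = Tuple[int, int, float, float]


def is_order_isomorphic(values: Sequence[float], pattern: Sequence[int]) -> bool:
    """True if values[s] < values[t] exactly when pattern[s] < pattern[t]."""
    k = len(pattern)
    return all(
        (values[s] < values[t]) == (pattern[s] < pattern[t])
        for s in range(k)
        for t in range(k)
    )


def in_box(i: int, v: float, box: Box) -> bool:
    i_lo, i_hi, v_lo, v_hi = box
    return i_lo <= i <= i_hi and v_lo <= v <= v_hi


def find_legged_appearance(
    f: Sequence[float], pattern: Sequence[int], leg_boxes: Sequence[Box]
) -> Optional[Tuple[int, ...]]:
    """Return indices of an appearance of `pattern` in f whose j-th leg
    lies in leg_boxes[j], or None if no such appearance exists."""
    k = len(pattern)
    for idxs in combinations(range(len(f)), k):
        vals = [f[i] for i in idxs]
        if is_order_isomorphic(vals, pattern) and all(
            in_box(i, f[i], leg_boxes[j]) for j, i in enumerate(idxs)
        ):
            return idxs
    return None
```

Passing the same all-covering box for every leg recovers the ordinary notion of an appearance, matching the remark that a single box with the constant leg mapping gives back plain pattern freeness.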
The problem of testing -legged -freeness was not previously explicitly studied and we believe that it is an interesting research direction in its own right.Even though its complexity is not known, we encounter it only as a subproblem in the testing of standard -freeness.This motivates the following definition.The (1, 3, 2)-appearances in the green boxes in Figure 2  We further note that our algorithm does not use any structure of .•  / ), for integer  ≤ log log log .

Proof of Correctness
In Section 5.1.1,we start with a description of the algorithm and the proof sketch for the first non-base case of testing -legged -freeness for  ∈ S  ,  = 3, with respect to an arbitrary  ∈ S  and fixed  ≥ 4. In Section 5.1.2,we present the proof of Theorem 5.3.

An example for 𝜈 ∈ S 3
For this exposition, we fix  = (1, 3, 2), and  being composed of 2 boxes  1 ,  2 in the same layer, where  1 is to the left of  2 , and  maps the 1, 3 legs of  to  1 , and the 2-leg to  2 .See Figure 2(D) for an illustration of one such case.In the figure, the green boxes represent  1 and  2 .The orange boxes indicate the subboxes in the finer grid formed when gridding is called on the green boxes.
We note that Figure 2(D) illustrates the hardest case for  ∈ S 3 .There are additional one-component configurations in which the boxes are in the same stripe or layer, but these turn out to be much easier.We will set  = () to be defined later and express the complexity as a function of .In Step 6 of Algorithm 3, we test each of the () many copies of  1 for a  ′ -legged (2, 1)appearance for which  ′ (2) =  1,3 and  ′ (1) =  2,2 .Then for any such  1 -copy in which such a  ′ -legged (2, 1)-appearance is found, any nonempty dense box  1,1 forming with  1 a copy of C 2 results in a -legged -appearance.
Since this is a reduction to generalized 2-pattern appearance, the recursion stops here with (log )-complexity per copy of  1 .Hence, altogether this will contribute a total of Õ() queries.Procedures along the same lines work for any of C  ,  = 1, 2, 3, 4.
If a desired -legged -appearance (or a -appearance) is found in the above process, then clearly a correct output is produced.
On the other hand, if indeed ( 1 ,  2 ) contains Ω() (that is, linear in the size of  1 ∪  2 ) many -legged -appearances that are consistent with one of the configurations C  ,  ∈ [4], then, by an averaging argument, there will be such a  1 and corresponding  1,1 that together contribute Ω(/) (that is, linear in the domain size of  1 ) such subpattern appearances.
We note that for the more general case of  > 3, the reduction will be done in higher complexity per component (that is dependent on  rather than just (log )).
3. Consider now a consistent configuration C  for  = 5, 6, 7, 8 that forms a single component (with 2 or 3 orange subboxes) as illustrated in Figure 2(E)-(H).In these cases, if such appearances contribute  ′ to the total distance, then a simple averaging argument shows that for a uniformly sampled component, its distance from -legged -freeness will be linear.Hence in Step 7, sampling such a component will enable us to recursively find a -legged -appearance with high probability.Since the size of a component on which the recursive call is made is Θ(/), the complexity of this step is Õ((/,  ′ )), where (, ) is the complexity of the algorithm, for the case of  ∈ S 3 , in terms of the size  of , and a distance parameter .
Correctness.The correctness of the algorithm follows from the fact that if  is indeed far from being -legged -free, then it must be that there are linearly many -legged -appearances in at least one of the 8 configurations discussed above, and for each case, either a -appearance or a -legged -appearance is found, by induction.Note however, that there is a drop in the distance parameter from  to  ′ , due to the deletion of points in Step 5 of the algorithm, and the averaging arguments resulting in the call with smaller distance parameters at Step 6 and Step 7.
This does not matter as long as  ′ is kept constant (or even 1/log ), forcing the recursion depth to be bounded from above by a constant.

Query complexity.
We now analyze the query complexity of the algorithm for the special case described above.The parameter  is to be interpreted as the query budget of the algorithm.We abuse notation and use  to indicate the total number of indices that the component  contains.
Let  be the smallest integer such that   ≥ .This parameter  denotes the recursion depth of our algorithm and we express our recurrence relation in terms of .Let (, ) denote the query complexity of the above algorithm with parameter  for functions over a domain of size  ≤   .We omit the dependence of the query complexity on  and assume that  = Θ(1) for the purposes of this high level description.
For the base case, we have  = 1.Then, (, 1) =  = Θ() since the algorithm can query all the indices and still be within the query budget.
If  > 1, ignoring polylog factors, we have (, ) =  +  + (,  − 1).The first summand here is the number of queries made by the Gridding.The second summand is the number of Recall that we assume that  is -far from being -legged -free.This implies that it contains a matching of -legged -appearances of size at least ||/.For the rest of this proof, we fix such a matching .Since each deleted point deletes at most 1 member from , there is a matching  ′ of cardinality at least ||  − || 10 ≥ 9|| 10 with all legs in the set of dense boxes remaining after Step 5.

■
We can partition  ′ into a collection of disjoint matchings  ′ = ∈[]   , where   contains the -legged -appearances in  ′ belonging to configuration copies in   ′ , ′ that have  components.Recall that all the legs of every -legged -appearance in  ′ belong to the single component  made of the boxes  1 , . . .,   .However, with respect to the grid   ′ , ′ , each such -appearance has a corresponding leg mapping that maps the legs of the appearance to boxes in   ′ , ′ , which are actually subboxes of  1 , . . .,   .Some of the leg mappings of -appearances to subboxes might result in configurations with multiple components in the finer grid   ′ , ′ .It follows that either  1 or ∈[−1]  +1 has cardinality at least 9|| 20 .Let  1 = 9/(20).Therefore, in expectation, a uniformly random copy of a 1-component configuration with at most  boxes contains at least ′ •(−1)!(2) −1 many -appearances from  1 .These -appearances each could have different leg mappings that are each (, )-consistent (see Definition 5.2).There query complexity.Testing -freeness w.r.t. the Hamming or deletion distances is very different, and still remains open for this setting.

Let $f:[n] \to \mathbb{R}$. We view $f$ as points in an $n \times |R(f)|$ grid $G_n$. The horizontal axis of $G_n$ is labeled with the indices in $[n]$. The vertical axis of $G_n$ represents the image $R(f)$ and is labeled with the distinct values in $R(f)$ in increasing order, $r_1 < r_2 < \dots < r_{n'}$, where $|R(f)| = n' \leq n$.

Figure 1. Each rectangle represents a different grid $G_n$, where the green shaded boxes correspond to some nonempty boxes in those grids. Each figure represents a different configuration type with respect to the appearance of some 4-length pattern. The dots and the numbers indicate possible splittings of the 4 legs of $\pi$. Figure (E) represents the pattern (4, 2, 1, 3) and all others represent the pattern (3, 2, 1, 4). The sizes of green boxes in the figures are not representative and are not drawn to scale.
For ,  ∈ [] such that  < , consider the -th and -th leg in the order from left to right along the grid, in the union of   -legged   -appearances in  ′  for  ∈ [].By the above statement and by virtue of the leg mappings   ,  ∈ [], the relative values of the -th and -th legs in the aforementioned union of appearances is identical to the relative values of the -th and -th legs in the -appearance occurring according to the configuration C.

Definition 4.1 (Density of a box). Consider index and value subsets $S \subseteq [n]$ and $I \subseteq R(f)$, respectively. The density of box$(S, I)$, denoted by den$(S, I)$, is the number of points in box$(S, I)$ normalized by its size $|S|$.

Definition 4.2 (Nice partition of a box). For index and value sets $S \subseteq [n]$ and $I \subseteq R(f)$ and parameter $m \leq n$, we say that $\mathcal{I} = \{I_1, I_2, \dots, I_{m'}\}$ forms a nice $m$-partition of box$(S, I)$ if:

It is clear from the description of Algorithm 1 that the intervals output by the algorithm are disjoint. Let $\mathcal{B} = \{[a, b] : a, b \in I \text{ and } \exists\, i, j \in S \text{ such that } f(i) = a, f(j) = b\}$ denote the set of all true intervals of points from box$(S, I)$. Consider an interval $[a, b] \in \mathcal{B}$ such that den$(S, [a, b]) \geq 4/m$.

4 :
Return the grid   ′ , ′ along with the tags on the various boxes.

Proof. The bound on query complexity as well as the first item follow directly from Claim 4.3.
For example, consider Figure 1(B),  = (3, 2, 1, 4), and  the component formed by the two boxes in the same layer.The function  maps the 3-leg and 2-leg of the -appearance to the left box and the 1-leg and 4-leg to the right box.The connected component  is -legged -free if it contains no -legged -appearances.It is -far from being -legged -free if the values of at least  • | ∈[] illustrate this.These appearances can belong to various possible configurations upon further gridding of the two boxes as illustrated by the various cases shown in the same figure.Specifically, the smaller orange boxes are representative of subboxes obtained upon gridding of the two green boxes.Figure2(A)shows a (, , )-consistent configuration composed of three components and Figure2(B)-(D)show (, , )-consistent configurations composed of two components.Figure2(E)-(H) show (, , )consistent configurations with just a single component and in these cases, the leg mappings are (, )-consistent.

Claim 5.7.
If | 1 | ≥  1 ||, then with high probability, Algorithm 3 finds a -legged -appearance or a -appearance in Step 7.P R O O F .The number of 1-component configuration copies in the grid   ′ , ′ that share a dense box and contain at most  boxes is at most ( − 1)! • (2) −1 .Combined with the fact that the total number of dense boxes is at most  ′ , we can see that the number of distinct copies of 1-component configurations with at most  boxes is at most  ′ • ( − 1)! • (2) −1 .

18, 12, 15, 6, 16, 30].
The size of box(, ) is defined to be ||.A box is nonempty if it contains at least one point and is empty otherwise.Consider an arbitrary collection of pairwise disjoint contiguous value intervals L = { 1 , . . .  }, such that  ⊆ ∈[]   .The set L naturally defines a partition of the points in box(, ) into  horizontal layers, box(,   ) for  ∈ [].Assume that, in addition to a set of layers L, we have a partition of  into disjoint intervals  =  =1   where   = [  ,   ], and   <  +1 ,  = 1, . . . − 1.The family S = { 1 , . . .  } induces a partition of box(, ) and the points in it, into  vertical stripes, box(  , ) for  ∈ [].The layering defined by L together with the stripes defined by S partition box(, ) into a coarse grid  , of boxes {box(  ,   )} , ∈[]

Definition 2.6.
Let  ∈ S  .Let  1 , . . .,   be a set of boxes forming one component  and  : [] ↦ → { 1 , . . .,   } be an arbitrary mapping of the legs of a -appearance to boxes.We say that  has a -legged -appearance if there is a -appearance in  =1   in which for each  ∈ [], the -th leg of  appears in the box  () .
Observation 2.7. Let $\pi$
∈ S  and assume that there exists a -appearance in   that, in the grid of boxes  , , forms a configuration C that contains  components C 1 , . . ., C  .Let the restriction of this -appearance to C 1 , . . ., C  define the permutation patterns  1 , . . .  with leg mappings  1 , . . .  , respectively.Then any collection { ′  and  =1  ′  is a copy of C, along with   -legged   -appearances in  ′  for each  ∈ [] defines a -appearance in  =1  ′ Since ∈[]  ′  form a copy of the configuration C, for any two boxes  1 and  2 belonging to ∈[]  ′  , their relative position in the grid  , is identical to the relative position of the corresponding boxes in C.
:  ∈ []} such that  ′  is a configuration copy of C  .P R O O F .
Erasure-resilient (ER) testing, introduced by Dixit, Raskhodnikova, Thakurta and Varma [16], is a generalization of property testing. In this model, algorithms get oracle access to functions for which the values of at most an $\alpha$ fraction of the points in the domain are erased by an adversary, for $\alpha \in [0, 1)$. For $f:[n] \to \mathbb{R}$, let NE$(f)$ be the nonerased values of $f$. The parameter $\alpha$ is given as an input to the algorithms, but they do not know NE$(f)$. On querying a point, the algorithm receives the function value if the point is nonerased, and a special symbol otherwise. For $\varepsilon \in (0, 1)$ and $\alpha \in [0, 1)$, an $\alpha$-erasure-resilient ($\alpha$-ER) $\varepsilon$-tester for $\mathcal{P}_\pi$ is a randomized algorithm that, on oracle access to a function $f:[n] \to \mathbb{R}$, accepts with probability 1 if $f|_{\mathrm{NE}(f)}$ is $\pi$-free, and rejects with probability at least 2/3 if there is a matching of size $\varepsilon n / k$ of $\pi$-appearances in NE$(f)$.
Therefore, the union of   -legged   -appearances in  ′  for  ∈ [] defines a -appearance.■ 2.3 Erasure-resilient testing D E F I N I T I O N 2 .8 (One-sided error erasure-resilient tester for P  ).
× |(  )| grid   consisting of points (,  ()) for  ∈ [], where, in particular, (  ) is neither known nor bounded.Our first goal is to approximate   by a coarse grid of boxes   ′ , ′ , where  = √  and  ′ = Θ().This is done by first querying  on Θ() independently sampled and uniformly random indices, upon which we obtain a partition L of (  ) into  ′ horizontal layers, corresponding to value intervals {  } ∈[ ′ ] .Then, we partition the index set [] into  ′ contiguous intervals {  } ∈[ ′ ] of equal sizes.This results in a grid   ′ , ′ , where a box box(  ,   ), ,  ∈ [ ′ ] is tagged as nonempty if it has at least one sampled point.A box is tagged as dense if it contains Ω  (1)-fraction of the sampled points in its stripe.All of the above takes Õ () = Õ ( √ ) queries.The following properties are satisfied with high probability: Each layer, that is box( [],   ),  ∈ [ ′ ], has approximately the same number of points from   .
and let a constant fraction of the $\pi$-appearances in $M_1$ be in a single-box component. Then, on average, a dense box, out of the $\Theta(m')$ dense boxes, is expected to contain at least $\Theta_k(n/m') = \Theta_k(m') = \Theta_k(\sqrt{n})$ many $\pi$-appearances. Thus a random dense box $B$ is likely to have $\Theta_k(\sqrt{n})$ many $\pi$-appearances, and hence, making queries to all points of such a box will enable us to find one such $\pi$-appearance. This takes an additional $n/m' = \Theta(\sqrt{n})$ queries, which is within the query budget.

Next, consider the case that a constant fraction of the $\pi$-appearances in $M_1$ belong to a configuration $\mathcal{C}$ that has more than one dense box (but only one connected component). An example of such a situation would be Figure 1(J). By a similar argument, a random dense box is expected to participate in at least $\Theta_k(n/m)$ many $\pi$-appearances of copies of configuration type $\mathcal{C}$. Since each dense box is part of at most ( 3 ) (constantly many) connected components of at most 4 dense boxes, sampling a random dense box $B$ and querying all the indices in each of the components that contain at most 4 dense boxes and involve $B$ is likely to find a $\pi$-appearance. Each connected component is over at most $4n/m'$ indices, resulting in $O(n/m)$ queries.

Case 2: $|M_3| \geq$  ′ /3, and assume first that a constant fraction of the members in $M_3$ belong to copies of a configuration $\mathcal{C}$ of 3 components $B_1, B_2, B_3$, where each one is a single box. Since the boxes $B_1, B_2, B_3$ belong to different components, no two of them share a layer or a stripe. For our current working example, $\pi = (3, 2, 1, 4)$, assume further that $B_1$ contains the 3, 2 legs of a $\pi$-appearance and $B_2, B_3$ contain its 1 and 4 legs, respectively (see Figure 1(G) for an example). In this case $B_1$ is not $(2, 1)$-free (as $B_1$ contains the $(3, 2)$-subpattern of $\pi$). [...] by the guarantee above we will find the corresponding $B$, $B_2$ and $B_3$ and a $\pi$-appearance in it (by Observation 2.7 with the trivial mapping).

A similar argument holds for a 3-component configuration $\mathcal{C}'$ in which one component contains more than one box. Let $\mathcal{C}'$ consist of two single-box components and a two-boxed component, as in Figure 1(D). In this case, a similar averaging argument shows the existence of a dense box $B$ for which (a) there is a dense box $B'$ forming a component $D$ with $B$, and dense boxes $B_2, B_3$ such that $D, B_2, B_3$ jointly form a copy of $\mathcal{C}'$, and (b) there are $\Omega_k(n/m) = \Omega_k(\sqrt{n})$ $\phi$-legged $(2, 1)$-appearances in $D$, where $\phi$ is such that the 2-leg maps to the upper box in $D$ and the 1-leg maps to the lower box in $D$. Hence, the test is similar to the simpler case above. We test for every dense box $B$ and every way to extend it into a component of two boxes by adding a box $B'$ (a constant number of ways) such that $D = (B, B')$ contains a $\phi$-legged $(2, 1)$-appearance. This again can be done using $O(\log n)$ queries per component copy $D$. Once this is done, finding $D, B_2, B_3$ that form a copy of $\mathcal{C}'$ results in a $\pi$-appearance by Observation 2.7.

Assume now that $|M_2| \geq$  ′ /3, and that the corresponding configurations of the $\pi$-appearances in $M_2$ contain two single-box components $B_1, B_2$, where $B_1$ holds the first 3 legs of $\pi$ and $B_2$ holds the 4-th leg. E.g., for $\pi = (3, 2, 1, 4)$, the configuration $\mathcal{C}$ contains two boxes $B_1, B_2$ where $B_1$ contains the subpattern $(3, 2, 1)$ and $B_2$ is any nonempty box such that $B_1 < B_2$ (see Figure 1(H) for an illustration). An averaging argument, as made in Case 2, shows that there is a dense box $B_1$ for which (a) $B_1$ is far from $(3, 2, 1)$-free, and (b) there is a corresponding dense box $B_2$ that, together with $B_1$, forms a copy of the configuration $\mathcal{C}$. This suggests a test that is conceptually similar to the test in Cases 1 and 2. We test each box for being $(3, 2, 1)$-free.
beyond $k = 4$. Applying the same ideas to $\pi \in S_k$, $k \geq 5$, works essentially the same way, provided we can test for $\phi$-legged $\nu$-freeness of $\nu \in S_r$ for $r < k$. This we know how to do for $\nu \in S_2$ but not beyond. For $r = 2$, testing $\phi$-legged $\nu$-freeness of $\nu \in S_2$ is simpler than testing monotonicity for nontrivial $\phi$, and is equivalent to testing monotonicity when testing is done in a one-boxed component. Hence, this can be done in $O(\log n)$ queries. For $r \geq 3$ the exact complexity is currently not known.

In particular, one difficulty is that after gridding, a superlinear number of nonempty boxes does not guarantee such an appearance, as Lemma 3.1 does not apply. For example, for even $m$, consider the grid $[m] \times [m]$ all of whose points in the top left quarter $\{1, \dots, m/2\} \times \{m/2 + 1, \dots, m\}$ and right bottom quarter $\{m/2 + 1, \dots, m\} \times \{1, \dots, m/2\}$ are marked. There are no restricted $(1, 2)$-appearances among the marked points where the 1 leg is from the right half and the 2-leg is from the left half, despite there being $\Omega(m^2)$ points.

However, for our goal of testing $\pi$-freeness for $\pi \in S_k$, we can relax the task of finding $\phi$-legged $\nu$-freeness of $\nu \in S_r$, $r \leq k$, to the following problem, which we call "generalized-testing $\nu$ w.r.t. $\pi$", denoted GeneralizedTesting$_\pi(\nu, D, \phi)$: The inputs are a permutation $\nu \in S_r$, a component $D$, and a leg mapping $\phi$. Our goal is to find either a $\phi$-legged $\nu$-appearance OR a $\pi$-appearance in $D$.

The way we solve this generalized problem is very similar, conceptually, to the way we solve the unrestricted problem; we decompose $D$ into an $m \times m$ grid of subboxes, $G_{m,m}$, by performing gridding of $D$. Then, we either find $\pi$ in $G_{m,m}$, or, using Lemma 3.1, conclude that there are only linearly many dense subboxes in $G_{m,m}$. At that point, we find a $\phi$-legged $\nu$-appearance by reducing it to the same problem of a $\phi'$-legged $\nu'$-freeness of smaller [...]
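The two-quarter example can be checked directly for small grids. The following Python sketch is only an illustration of that example; the function names are hypothetical. It builds the marked points of the top left and bottom right quarters and counts the increasing pairs, that is, (1, 2)-appearances, whose two legs lie in different quarters.

```python
def marked_quarters(m):
    """Marked points (index, value) of the m x m grid, m even, 1-based."""
    half = m // 2
    top_left = [(x, y) for x in range(1, half + 1)
                for y in range(half + 1, m + 1)]
    bottom_right = [(x, y) for x in range(half + 1, m + 1)
                    for y in range(1, half + 1)]
    return top_left, bottom_right


def increasing_pairs(first, second):
    """Count pairs (p, q), p in `first`, q in `second`, forming a (1, 2)-appearance.

    p carries the 1-leg (smaller index, smaller value); q carries the 2-leg.
    """
    return sum(1 for (x1, y1) in first for (x2, y2) in second
               if x1 < x2 and y1 < y2)
```

For $m = 6$ there are $18 = m^2/2$ marked points, yet `increasing_pairs` returns 0 for either assignment of the legs to the two quarters: every point of the left quarter has a larger value than every point of the right quarter, so each cross pair is a (2, 1)-appearance instead.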

1: Sample a set of $m \log^4 n$ indices from $S$ uniformly and independently at random.
2: Let $P$ denote the multiset of points in the sample that belong to $\mathrm{box}(S, I)$ and let $s$ denote the cardinality of $P$ including multiplicities. If $s < m \log^2 n$, then FAIL.
3: We sort the multiset of values $V = \{f(p) : p \in P\}$ to form a strictly increasing sequence $\mathrm{seq} = (v'_1 < \dots < v'_q)$, where, with each $i \in [q]$, we associate a weight $w_i$ that equals the multiplicity of $v'_i$ in the multiset $V$. Note that $\sum_{i \in [q]} w_i = s$.
4: We now partition the sequence $W = (w_1, \dots, w_q)$ into maximal disjoint contiguous subsequences $W_1, \dots, W_{m''}$ such that for each $j \in [m'']$, either $\sum_{w \in W_j} w < 2s/m$, or $W_j$ contains only one member $w$ for which $w > s/m$. ⊲ This can be done greedily as follows. [...]

⊲ 1/8-th of the threshold $\beta$ for marking a box as dense.

Input: $S \subseteq [n]$ is a union of disjoint stripes, $I \subseteq R(f)$ is a disjoint union of intervals of values in $R(f)$, $D = \mathrm{box}(S, I)$ is the domain on which we do gridding, $m$ is a parameter defining the 'coarse' grid size, $\beta < 1$ is a density threshold.
Output: A grid of boxes $G_{m',m'}$, $m' \leq 2m$, in which there will be $\tilde{O}(m')$ marked boxes.
1: Call Layering (Algorithm 1) on inputs $S, I, m$. This returns, with high probability, a set $\mathcal{I}$ of $m' \leq 2m$ value intervals $I = \bigcup_{j \in [m']} I_j$ that forms a nice $m$-partition of $\mathrm{box}(S, I)$.
2: Partition $S$ into $m'$ contiguous intervals $S_1, \dots, S_{m'}$, each of size $|S|/m'$. This defines the grid of boxes $G_{m',m'} = \{\mathrm{box}(S_i, I_j) : (i, j) \in [m']^2\}$ inside the larger box $\mathrm{box}(S, I)$.
[...] $i, j \in [m']$. For each $(i, j) \in [m']^2$, if $\mathrm{box}(S_i, I_j)$ contains a sampled point, then tag that box as marked. If $\mathrm{box}(S_i, I_j)$ contains at least 3/4 fraction of the sampled points in the stripe $S_i$, tag that box as dense.

For every $i \in [m']$, either the stripe $\mathrm{box}(S_i, I)$ contains at least log 2  100 2 marked boxes, or the number of points in the marked boxes in $\mathrm{box}(S_i, I)$ is at least $(1 - \frac{1}{\log^2 n}) \cdot |S_i|$.
Every box that is tagged dense has density at least $\beta/8$, and every box of density at least $\beta$ is tagged as dense.
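The partition of the weight sequence into contiguous groups can be illustrated with a short Python sketch. The source's own greedy procedure is not reproduced here, so this is one possible greedy pass under an assumption of mine, not necessarily the authors' procedure; the names `weights_from_values` and `greedy_layers` are hypothetical. A weight above a $1/m$ fraction of the total becomes a group on its own, and lighter weights are accumulated while the group sum stays below a $2/m$ fraction of the total.

```python
from collections import Counter


def weights_from_values(values):
    """Distinct values in increasing order, with their multiplicities."""
    counts = Counter(values)
    distinct = sorted(counts)
    return distinct, [counts[v] for v in distinct]


def greedy_layers(weights, m):
    """Split `weights` into contiguous groups.

    Each group either sums to less than 2*total/m, or is a single
    weight exceeding total/m. Integer arithmetic avoids rounding issues.
    """
    total = sum(weights)
    groups, current, current_sum = [], [], 0
    for w in weights:
        if w * m > total:                      # heavy weight: its own group
            if current:
                groups.append(current)
            groups.append([w])
            current, current_sum = [], 0
        elif (current_sum + w) * m < 2 * total:
            current.append(w)                  # still below the 2*total/m cap
            current_sum += w
        else:
            groups.append(current)             # close the group, start a new one
            current, current_sum = [w], w
    if current:
        groups.append(current)
    return groups
```

For example, `greedy_layers([1, 1, 6, 1, 1, 1, 1], 4)` isolates the weight 6 (which exceeds $12/4 = 3$) and groups the light weights around it.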
AlgTest$_\pi(\nu, \phi, D, m, \varepsilon)$ finds either a $\phi$-legged $\nu$-appearance or a $\pi$-appearance, with probability at least $1 - o(1)$. Its query complexity is Õ     Θ(  ), where the $\tilde{O}(\cdot)$ notation hides polylogarithmic factors in $n$. We note that since AlgTest$_\pi(\nu, \phi, D, m, \varepsilon)$ either finds a $\pi$-appearance or a $\phi$-legged appearance in $D$, then if $D$ is free of $\phi$-legged $\nu$-appearances, the algorithm will never return such an appearance. Our $\pi$-freeness tester is simply AlgTest$_\pi(\pi, \phi, G_n, m, \varepsilon)$, where $\phi$ is the constant function mapping each leg to the entire grid $G_n$ and  =  1/ for an integer parameter  ≤ log log log $n$ that we can control.

COROLLARY 5.4. There is a 1-sided error test for $\pi$-freeness of functions of the form $f:[n] \to \mathbb{R}$, for every $\pi \in S_k$, with query-complexity Õ(   Θ(  ) [...]

We do not specify  since, as explained above,  is only needed at Step 4 of the algorithm when the number of marked boxes is superlinear in $m$ in some recursive call. The argument here holds for any $\pi \in S_k$, $k \geq 4$.

Algorithm to test $\phi$-legged $\nu$-freeness. Let $\nu = (1, 3, 2)$ and $\phi$ be such that $\phi(1) = \phi(3) = D_1$ and $\phi(2) = D_2$.

1. We assume that $D_1, D_2$ are over  ≤  indices each, and that the distance of $D_1 \cup D_2$ from $\phi$-legged $\nu$-freeness is at least $\varepsilon = \Omega(1)$. In particular $D_1, D_2$ are dense. In Step 3 of Algorithm 3, we grid the appropriate box containing $D_1 \cup D_2$ (as defined in Step 1 of Algorithm 3) into an $m' \times m'$ grid, $G_{m',m'}$, of subboxes (each over 2/ ′ indices), where $m \leq m' \leq 2m$. We either find a $\pi$-appearance among the sampled points or we may assume, after Steps 4 and 5, that there are $O(m')$ dense subboxes in $G_{m',m'}$ and that each layer and each stripe contains $O(1)$ dense boxes. The latter claim is obtained by an averaging argument and is described in the formal proof in Section 5.1.2. The argument is that if $D_1 \cup D_2$ contains a large matching of $\phi$-legged $\nu$-appearances, then so does the restricted domain after deleting points from non-dense boxes as well as deleting layers and stripes that contain too many dense boxes from $G_{m',m'}$. These steps take $\tilde{O}(m)$ queries overall, which is the complexity of the algorithm Gridding.

2. A $\phi$-legged $\nu$-appearance in $D_1 \cup D_2$ can be in 8 possible configurations in the grid $G_{m',m'}$, as depicted in Figure 2. Consider first $\mathcal{C}_1, \dots, \mathcal{C}_4$ as in Figure 2(A)-(D), which form 2 or 3 components each. For these, a $\phi$-legged $\nu$-appearance in $D_1 \cup D_2$ decomposes into two or three subpatterns, for which any restricted appearances in the corresponding components result in a $\phi$-legged $\nu$-appearance. For example, in Figure 2(B) the configuration $\mathcal{C}_2$ contains one component  1 = ($B_{1,3}$, $B_{2,2}$), where $B_{1,3} \in D_1$, $B_{2,2} \in D_2$, and another single-boxed component $B_{1,1} \in D_1$, where $B_{i,j}$ is the orange subbox contained within the green box $D_i$ and such that the $j$-th leg belongs to $B_{i,j}$ for $i \in [2]$, $j \in [3]$.
CLAIM 5.6. If $D$ is $\varepsilon$-far from $\phi$-legged $\nu$-free, then the union of dense boxes that remain after Step 5 in Algorithm 3 contains a matching $M'$ of $\phi$-legged $\nu$-appearances of cardinality at least 9||/10.

PROOF. Since $G_{m',m'}$ contains at most  ′ marked boxes, it follows that at most   =  100 fraction of the layers have more than  marked boxes. Hence, using the bound on the density of each layer (due to the nice $m$-partition of $\mathrm{box}(S, I)$) from Claim 4.4, deleting the points in these marked boxes deletes at most  100  ′ • 4  • || ≤ 8||/100 points from . By a similar argument, the number of points that get deleted by removing stripes with more than  marked boxes is at most ||/100. Moreover, by the third item of Claim 4.4, we know that the total number of points that belong to marked boxes that are not tagged dense by AlgTest is at most  ||  ′ •  ′ ≤ ||/200, where the inequality follows by our setting of . Finally, combining the second item in Claim 4.4 with the fact that we delete each stripe containing more than  = 100  marked boxes, for each stripe that is left, the marked boxes contain at least a $1 - 1/\log^2 n$ fraction of the points in it. Hence, the total number of points deleted in Step 5 of Algorithm 3 is at most [...]/10.