Original poster: Sky-Tiger

Google? Evil? You have no idea

Forum badges: 350
Best Moderator of 2006 (2007-01-24 12:56:49)
NBA Tycoon (2008-04-21 22:57:29)
Landlord Star (2008-11-17 19:37:35)
Best Moderator of 2008 (2009-03-26 09:33:53)
Stock God (2009-04-01 10:05:56)
NBA Playoffs Tycoon (2009-06-16 11:48:01)
NBA Playoffs Tycoon (2009-06-16 11:48:01)
ITPUB Best Moderator of the Year (2011-04-08 18:37:09)
ITPUB Best Moderator of the Year (2011-12-28 15:24:18)
ITPUB Annual Best Original Technical Post Award (2012-03-13 17:12:05)
21#
Original poster | Posted 2014-5-3 23:43
A common operation on a data set is grouping its items by the values of one or more of their properties. As you saw in the earlier transactions currency-grouping example, this operation can be cumbersome, verbose, and error prone when implemented in an imperative style, but it translates into a single, very readable statement when rewritten in the more functional style that Java 8 encourages. As a second example of how this feature works, suppose you want to classify the dishes in the menu by type, putting the ones containing meat in one group, the ones with fish in another, and all others in a third. You can easily perform this task using a Collector created with the Collectors.groupingBy factory method, as follows:
Map<Dish.Type, List<Dish>> dishesByType = menu.stream().collect(groupingBy(Dish::getType));
This will result in the following Map:
{FISH=[prawns, salmon], OTHER=[french fries, rice, season fruit, pizza],
   MEAT=[pork, beef, chicken]}
Here, you pass to the groupingBy method a Function (expressed in the form of a method reference) that extracts the corresponding Dish.Type from each Dish in the Stream. We call this Function a classification function because it’s used to classify the elements of the Stream into different groups. More generally, you pass to the groupingBy method a classification Function that transforms each item in the Stream into the value under which that item will be classified. The result of this grouping operation, shown in figure 5.4, is a Map whose keys are the values returned by the classification Function and whose values are Lists of all the items in the Stream for which the classification Function returns that key. In the menu-classification example a key is the type of dish, and its value is a List containing all the dishes of that type.
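The grouping above can be tried out with a minimal, self-contained sketch. The Dish class below is a hypothetical stand-in for the book's class (only a name and a type are modeled), and the menu simply lists the nine dishes that appear in the Map shown above; the class name GroupByTypeExample is likewise just an illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.groupingBy;

class GroupByTypeExample {

    // Hypothetical stand-in for the book's Dish class; only name and type are modeled.
    static class Dish {
        enum Type { MEAT, FISH, OTHER }

        private final String name;
        private final Type type;

        Dish(String name, Type type) {
            this.name = name;
            this.type = type;
        }

        Type getType() {
            return type;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    // Assumed menu, built from the dish names shown in the post's output.
    static final List<Dish> MENU = Arrays.asList(
            new Dish("pork", Dish.Type.MEAT),
            new Dish("beef", Dish.Type.MEAT),
            new Dish("chicken", Dish.Type.MEAT),
            new Dish("french fries", Dish.Type.OTHER),
            new Dish("rice", Dish.Type.OTHER),
            new Dish("season fruit", Dish.Type.OTHER),
            new Dish("pizza", Dish.Type.OTHER),
            new Dish("prawns", Dish.Type.FISH),
            new Dish("salmon", Dish.Type.FISH));

    // Dish::getType is the classification function: its return value becomes the Map key.
    static Map<Dish.Type, List<Dish>> dishesByType(List<Dish> menu) {
        return menu.stream().collect(groupingBy(Dish::getType));
    }

    public static void main(String[] args) {
        System.out.println(dishesByType(MENU));
    }
}
```

Within each group the dishes keep the order in which they appear in the menu. The order of the keys in the printed Map is not something to rely on, because the single-argument groupingBy makes no promise about the kind of Map it returns.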

22#
Original poster | Posted 2014-5-3 23:49
But it isn’t always possible to use a method reference as a classification Function, because the classification could be something more complex than a simple property accessor. For instance, you could decide to classify as “diet” all dishes with 400 calories or fewer, as “normal” the dishes with more than 400 and up to 700 calories, and as “fat” the ones with more than 700 calories. Since the author of the Dish class unhelpfully didn’t provide such an operation as a method, you can’t use a method reference in this case, but you can express this logic in a lambda expression:
public enum CaloricLevel { DIET, NORMAL, FAT }
Map<CaloricLevel, List<Dish>> dishesByCaloricLevel = menu.stream().collect(
    groupingBy(dish -> {
        if (dish.getCalories() <= 400) return CaloricLevel.DIET;
        else if (dish.getCalories() <= 700) return CaloricLevel.NORMAL;
        else return CaloricLevel.FAT;
    }));
So now you’ve seen how to group the dishes in the menu, both by their type and by calories, but what if you want to use both criteria at the same time?
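As a runnable sketch of the calorie-based grouping, the version below keeps the same thresholds but moves them into a named helper method, so the classification can be tested on its own. The Dish class, the helper caloricLevel, and the sample dishes with their calorie values are all assumptions made for illustration; they are not taken from the book.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.groupingBy;

class GroupByCaloriesExample {

    enum CaloricLevel { DIET, NORMAL, FAT }

    // Hypothetical stand-in for the book's Dish class; only name and calories are modeled.
    static class Dish {
        private final String name;
        private final int calories;

        Dish(String name, int calories) {
            this.name = name;
            this.calories = calories;
        }

        int getCalories() {
            return calories;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    // Made-up dishes chosen to sit on and around the two thresholds.
    static final List<Dish> SAMPLE_MENU = Arrays.asList(
            new Dish("salad", 400),
            new Dish("pasta", 700),
            new Dish("burger", 701),
            new Dish("fruit", 120));

    // Same thresholds as the lambda in the post: <= 400 DIET, <= 700 NORMAL, otherwise FAT.
    static CaloricLevel caloricLevel(Dish dish) {
        if (dish.getCalories() <= 400) return CaloricLevel.DIET;
        if (dish.getCalories() <= 700) return CaloricLevel.NORMAL;
        return CaloricLevel.FAT;
    }

    // Because the helper lives in our own class, it can be passed as a method reference.
    static Map<CaloricLevel, List<Dish>> dishesByCaloricLevel(List<Dish> menu) {
        return menu.stream().collect(groupingBy(GroupByCaloriesExample::caloricLevel));
    }

    public static void main(String[] args) {
        System.out.println(dishesByCaloricLevel(SAMPLE_MENU));
    }
}
```

Whether to inline the logic as a lambda or to name it is a matter of taste; a named helper mainly pays off when the same classification is reused or needs its own tests.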

23#
Original poster | Posted 2014-5-5 17:06
In general it’s impossible (and pointless) to give a quantitative rule for when to use a parallel Stream, because any suggestion like “use a parallel Stream only if you have at least 1 thousand (or 1 million or whatever number you want) elements” could be correct for a specific operation running on a specific machine, yet completely wrong in an even marginally different context. But it’s at least possible to provide some qualitative advice that can help you decide whether a parallel Stream makes sense in a given situation:
If in doubt, measure. Turning a sequential Stream into a parallel one is trivial but not always the right thing to do. As we already demonstrated in this section, a parallel Stream isn’t always faster than the corresponding sequential version. Moreover, parallel Streams can sometimes work in a counterintuitive way, so the first and most important suggestion when choosing between sequential and parallel Streams is to always check their performance with an appropriate benchmark.
Watch out for boxing. Automatic boxing and unboxing operations can dramatically hurt performance. Primitive Streams (IntStream, LongStream, and DoubleStream) were included to avoid these operations, and the performance gained by using them whenever possible can often exceed the advantage provided by parallel Streams.
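The first two pieces of advice can be tried together with a small sketch: the same sum computed over a boxed Stream<Long>, over a primitive LongStream, and over a parallel LongStream. The class name, the method names, and the crude bestMillis timer are assumptions for illustration only. The timer is a rough wall-clock measurement, not a proper benchmark; a dedicated harness such as JMH gives far more trustworthy numbers.

```java
import java.util.function.LongSupplier;
import java.util.stream.LongStream;
import java.util.stream.Stream;

class BoxingExample {

    // Boxed version: Stream.iterate produces Long objects, and every addition unboxes and re-boxes.
    static long boxedSum(long n) {
        return Stream.iterate(1L, i -> i + 1).limit(n).reduce(0L, Long::sum);
    }

    // Primitive version: LongStream works on long values directly.
    static long primitiveSum(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    // Primitive and parallel: a range is also a source that splits cheaply.
    static long parallelPrimitiveSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    // Crude timing helper: best wall-clock time over several runs, in milliseconds.
    static long bestMillis(LongSupplier task, int runs) {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.getAsLong();
            best = Math.min(best, (System.nanoTime() - start) / 1_000_000);
        }
        return best;
    }

    public static void main(String[] args) {
        long n = 10_000_000L;
        System.out.println("boxed sequential:     " + bestMillis(() -> boxedSum(n), 5) + " ms");
        System.out.println("primitive sequential: " + bestMillis(() -> primitiveSum(n), 5) + " ms");
        System.out.println("primitive parallel:   " + bestMillis(() -> parallelPrimitiveSum(n), 5) + " ms");
    }
}
```

The timings depend entirely on the machine and the JVM, which is the point of the advice: run the comparison in your own environment before deciding.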

24#
Original poster | Posted 2014-5-5 17:06
Some operations naturally perform worse on a parallel Stream than on a sequential Stream. In particular, operations such as limit and findFirst that rely on the order of the elements are expensive in a parallel Stream. For example, findAny will perform better than findFirst because it isn’t constrained to operate in the encounter order. You can always turn an ordered Stream into an unordered Stream by invoking the method unordered() on it. So, for instance, if you need N elements of your Stream and you’re not necessarily interested in the first N ones, calling limit on an unordered parallel Stream may execute more efficiently than on a Stream with an encounter order (for example, when the source is a List).
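The difference between the ordered and unordered operations can be seen in a short sketch. The class and method names are hypothetical, and the list of the numbers 1 to 1000 is only sample data. The sketch shows what each call guarantees, not how fast it runs.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class OrderingExample {

    // Sample data: a List, so the Stream has an encounter order.
    static List<Integer> numbers() {
        return IntStream.rangeClosed(1, 1000).boxed().collect(Collectors.toList());
    }

    // findFirst must return the first match in encounter order, even in parallel.
    static Optional<Integer> firstEven(List<Integer> values) {
        return values.parallelStream().filter(x -> x % 2 == 0).findFirst();
    }

    // findAny may return whichever match a worker thread finds first.
    static Optional<Integer> anyEven(List<Integer> values) {
        return values.parallelStream().filter(x -> x % 2 == 0).findAny();
    }

    // unordered() drops the ordering constraint, so limit may keep any n elements.
    static List<Integer> anyN(List<Integer> values, int n) {
        return values.parallelStream().unordered().limit(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> values = numbers();
        System.out.println("findFirst: " + firstEven(values));
        System.out.println("findAny:   " + anyEven(values));
        System.out.println("any 10:    " + anyN(values, 10));
    }
}
```

Only the findFirst result is fully determined by the data; the other two calls may return different elements from run to run, which is exactly the freedom that lets them avoid coordination between threads.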

25#
Original poster | Posted 2014-5-5 17:06
Consider the total computational cost of the pipeline of operations performed by the Stream. With N being the number of elements to be processed and Q the approximate cost of processing one of these elements through the Stream pipeline, the product N*Q gives a rough qualitative estimate of this cost. A higher value for this cost implies a better chance of good performance when using a parallel Stream.

26#
Original poster | Posted 2014-5-5 17:07
For a small amount of data, choosing a parallel Stream is almost never a winning decision. The advantages of processing only a few elements in parallel aren’t enough to compensate for the additional cost introduced by the parallelization process.

27#
Original poster | Posted 2014-5-5 17:07
Take into account how well the data structure underlying the Stream decomposes. For instance, an ArrayList can be split much more efficiently than a LinkedList, because an ArrayList can be evenly divided without traversing it, whereas a LinkedList has to be traversed to be split. Also, the primitive Streams created with the range() factory method can be decomposed very quickly. Finally, as you’ll learn in section 6.3, you can get full control of this decomposition process by implementing your own Spliterator.
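One way to look at decomposition directly is to ask a collection for its Spliterator and split it once. The sketch below does this for an ArrayList and a LinkedList holding the same 10,000 sample numbers; the class name and the splitOnce helper are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Spliterator;

class SplitExample {

    // Splits the list's Spliterator once and reports {size of split-off part, size of remainder}.
    static long[] splitOnce(List<Integer> list) {
        Spliterator<Integer> remainder = list.spliterator();
        Spliterator<Integer> prefix = remainder.trySplit();
        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, remainder.estimateSize() };
    }

    // Fills the given list with the numbers 0 .. count-1 and returns it.
    static List<Integer> fill(List<Integer> list, int count) {
        for (int i = 0; i < count; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        long[] array = splitOnce(fill(new ArrayList<>(), 10_000));
        long[] linked = splitOnce(fill(new LinkedList<>(), 10_000));
        System.out.println("ArrayList:  " + array[0] + " + " + array[1]);
        System.out.println("LinkedList: " + linked[0] + " + " + linked[1]);
    }
}
```

The ArrayList divides into two equal halves by index arithmetic alone. Comparing that with the sizes reported for the LinkedList on your own JDK shows how differently the two sources decompose.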

28#
Original poster | Posted 2014-5-5 17:14
The characteristics of a Stream, and how the intermediate operations through the pipeline modify them, can change the performance of the decomposition process. For example, a SIZED Stream can be divided into two equal parts, and then each part can be processed in parallel more effectively, but a filter operation can throw away an unpredictable number of elements, making the size of the Stream itself unknown.
Consider whether a terminal operation has a cheap or expensive merge step. If the merge step is expensive, the cost of re-aggregating the partial results generated by each substream can negatively affect the performance of a parallel Stream.
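The effect of intermediate operations on the SIZED characteristic can be checked through the Spliterator of the pipeline. In this sketch the class name and the isSized helper are hypothetical; Spliterator.SIZED and Stream.spliterator() are part of the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Stream;

class SizedExample {

    // Reports whether the pipeline still knows its exact element count.
    // Note: spliterator() is a terminal operation, so the Stream can't be reused afterward.
    static boolean isSized(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.SIZED);
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));

        // A List source knows its size.
        System.out.println("source:     " + isSized(values.stream()));
        // map produces exactly one output per input, so the size stays known.
        System.out.println("after map:  " + isSized(values.stream().map(x -> x * 2)));
        // filter may drop any number of elements, so the size becomes unknown.
        System.out.println("after filter: " + isSized(values.stream().filter(x -> x > 2)));
    }
}
```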
